Recommended Books and Resources


There are many good books on quantum mechanics. Here’s a selection that I like: 
• Griffiths, Introduction to Quantum Mechanics

An excellent way to ease yourself into quantum mechanics, with uniformly clear
explanations. For this course, it covers both approximation methods and scattering.


• Shankar, Principles of Quantum Mechanics

• James Binney and David Skinner, The Physics of Quantum Mechanics

• Weinberg, Lectures on Quantum Mechanics


These are all good books, giving plenty of detail and covering more advanced topics. 
Shankar is expansive, Binney and Skinner clear and concise. Weinberg likes his own 
notation more than you will like his notation, but it’s worth persevering. 


• John Preskill, Course on Quantum Computation


Preskill’s online lecture course has become the default resource for topics on quantum 
foundations. 


A number of lecture notes are available on the web. Links can be found on the course 
webpage: http://www.damtp.cam.ac.uk/user/tong/topicsinqm.html
The Cambridge mathematics tripos includes a course called “Applications of Quantum
Mechanics”. It is something of a hybrid, containing some topics from these notes (the 
Variational Principle, and Scattering Theory), together with an introduction to Solid 
State Physics. I have chosen to split these into two, more traditionally-titled sets of 
notes, with Lectures on Solid State Physics carved off separately. 


If you’re a Cambridge student, the relevant chapters making up the lectures on 
Applications of Quantum Mechanics can be found here. 


0. Introduction 


“The true meaning of quantum mechanics can be found in the answers it 
gives about the world we inhabit.” 


Me, in a previous set of lecture notes. 


Our previous courses on quantum mechanics were largely focussed on understanding 
the mathematical formalism of the subject. The purpose of this course is to put this 
understanding to use. 


The applications of quantum mechanics are many and varied, and vast swathes of 
modern physics fall under this rubric. Many of these applications naturally fall into 
different lectures, such as Solid State Physics or Statistical Physics or, if we include 
relativity into the mix, Particle Physics and Quantum Field Theory. In these lectures 
we cover a number of topics that didn’t have such a natural home. This means that 
we're left with something of a mishmash of topics. 


The first two chapters describe tools that are useful in the study of many different 
quantum systems: they cover the role of discrete symmetries in quantum mechanics,
and the use of approximation methods to solve quantum systems. Subsequent chapters 
are more focussed on specific quantum systems. 


We devote a significant amount of time to atomic physics. Current research in atomic 
physics is largely devoted to exquisitely precise manipulation of cold atoms, bending 
them to our will. Here, our focus is more old-fashioned and we look only at the 
foundations of the subject, including the detailed spectrum of the hydrogen atom,
and a few tentative steps towards understanding the structure of many-electron atoms. 
We also describe the various responses of atoms to electromagnetic prodding. 


We devote one chapter of these notes to revisiting some of the foundational aspects 
of quantum mechanics, starting with the important role played by entanglement as a 
way to distinguish between a quantum and classical world. We will provide a more 
general view of the basic ideas of states and measurements, as well as an introduction 
to the quantum mechanics of open systems. 


The final topic is scattering theory. In the past century, physicists have developed a
foolproof and powerful method to understand everything and anything: you take the 
object that you’re interested in and you throw something at it. This technique was 
pioneered by Rutherford who used it to understand the structure of the atom. It was 
used by Franklin, Crick and Watson to understand the structure of DNA. And, more
recently, it was used at the LHC to demonstrate the existence of the Higgs boson. In
fact, throwing stuff at other stuff is the single most important experimental method 
known to science. It underlies much of what we know about condensed matter physics 
and all of what we know about high-energy physics. 


In many ways, these lectures are where theoretical physics starts to fracture into 
separate sub-disciplines. Yet areas of physics which study systems separated by orders 
of magnitude — from the big bang, to stars, to materials, to information, to atoms and 
beyond — all rest on a common language and background. These lectures build this 
shared base of knowledge. 


1. Discrete Symmetries 


In this section, we discuss the implementation of discrete symmetries in quantum
mechanics. Our symmetries of choice are parity, a spatial reflection, and time reversal.


1.1 Parity 


A cartoon picture of parity is to take a state and turn it into its image as seen in a 
mirror. This is best viewed as an action on space itself. In three spatial dimensions, 
we usually take parity to act as 


$$P : \mathbf{x} \mapsto -\mathbf{x} \tag{1.1}$$


More generally, in $d$ spatial dimensions the parity operator is a linear map on the $d$
spatial coordinates such that $P \in O(d)$ and $\det P = -1$. This means, in particular,
that the definition (1.1) is good whenever $d$ is odd, but not good when $d$ is even, where
it coincides with a rotation. A definition which works in all dimensions is simply
$P : x^1 \mapsto -x^1$ and $P : x^i \mapsto x^i$ for all $i \neq 1$, which differs from (1.1) by a spatial
rotation.


Here we will restrict attention to d = 1 and d = 3, where the definition (1.1) is the 
standard one. We can use this to tell us how the classical state of a particle changes. 
Recall that, classically, the state of a particle is defined by a point $(\mathbf{x}, \mathbf{p})$ in phase space.
Since $\mathbf{p} = m\dot{\mathbf{x}}$, parity must act as

$$P : (\mathbf{x}, \mathbf{p}) \mapsto (-\mathbf{x}, -\mathbf{p}) \tag{1.2}$$


Here our interest lies in quantum mechanics so we want to introduce a parity operator
which acts on the Hilbert space. We call this operator $\pi$. It is natural to define $\pi$ by
its action on the position basis,

$$\pi|\mathbf{x}\rangle = |-\mathbf{x}\rangle \tag{1.3}$$

This means that, when acting on wavefunctions,

$$\pi : \psi(\mathbf{x}) \mapsto \psi(-\mathbf{x})$$


Note that, in contrast to continuous symmetries, there is no one-parameter family of 
transformations. You don’t get to act by a little bit of parity: you either do it or 
you don’t. Recall that for continuous symmetries, the action on the Hilbert space is
implemented by a unitary operator $U$ while its infinitesimal form $U \approx 1 + i\epsilon T$ (with $\epsilon$
small) yields the Hermitian operator $T$ called the “generator”. In contrast, the parity
operator $\pi$ is both unitary and Hermitian. This follows from

$$\pi\pi^\dagger = 1 \quad \text{and} \quad \pi^2 = 1 \quad \Rightarrow \quad \pi = \pi^\dagger = \pi^{-1} \tag{1.4}$$


Given the action of parity on the classical state (1.2), we should now derive how it acts
on any other states, for example the momentum basis $|\mathbf{p}\rangle$. It’s not difficult to check
that (1.3) implies

$$\pi|\mathbf{p}\rangle = |-\mathbf{p}\rangle$$

as we might expect from our classical intuition. This essentially follows because
$\mathbf{p} = -i\hbar\,\partial/\partial\mathbf{x}$ in the position representation. Alternatively, you can see it from the form of
the plane waves, $\langle\mathbf{x}|\mathbf{p}\rangle \sim e^{i\mathbf{p}\cdot\mathbf{x}/\hbar}$.
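As a quick numerical sanity check of (1.3) and (1.4), here is a minimal sketch in Python. It is not part of the original notes: the grid, the two test wavefunctions and the helper names (`parity`, `plane_wave`, `inner`) are illustrative assumptions, and we work in one dimension with ħ = 1. On a grid that is symmetric about the origin, parity is nothing more than list reversal.

```python
import cmath

# Symmetric grid: xs[j] = -xs[N-1-j], so x -> -x is just list reversal.
N = 201
X = 10.0
xs = [-X + 2 * X * j / (N - 1) for j in range(N)]

def parity(psi):
    """(pi psi)(x) = psi(-x) on the symmetric grid."""
    return psi[::-1]

def plane_wave(p):
    """Unnormalised <x|p> ~ exp(i p x), in units with hbar = 1."""
    return [cmath.exp(1j * p * x) for x in xs]

def inner(a, b):
    """Discrete stand-in for <a|b> (the grid spacing is omitted)."""
    return sum(u.conjugate() * v for u, v in zip(a, b))

def max_diff(a, b):
    return max(abs(u - v) for u, v in zip(a, b))

p = 1.3
psi = [cmath.exp(-(x - 1.0) ** 2 + 1j * p * x) for x in xs]  # arbitrary test state
phi = [cmath.exp(-(x + 0.5) ** 2 / 2) for x in xs]           # second test state

err_involution = max_diff(parity(parity(psi)), psi)                     # pi^2 = 1
err_hermitian = abs(inner(phi, parity(psi)) - inner(parity(phi), psi))  # pi = pi^dagger
err_momentum = max_diff(parity(plane_wave(p)), plane_wave(-p))          # pi|p> = |-p>

print(err_involution, err_hermitian, err_momentum)
```

All three printed numbers should vanish up to rounding error. The check of Hermiticity is only as good as the discrete inner product, which is a crude stand-in for the integral over x.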


The Action of Parity on Operators 
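We can also define the parity operator by its action on the operators. From our
discussion above, we have

$$\pi\mathbf{x}\pi^\dagger = -\mathbf{x} \quad \text{and} \quad \pi\mathbf{p}\pi^\dagger = -\mathbf{p}$$

Using this, together with (1.4), we can deduce the action of parity on the angular
momentum operator $\mathbf{L} = \mathbf{x} \times \mathbf{p}$,

$$\pi\mathbf{L}\pi^\dagger = +\mathbf{L} \tag{1.5}$$

We can also ask how parity acts on the spin operator $\mathbf{S}$. Because this is another form
of angular momentum, we take

$$\pi\mathbf{S}\pi^\dagger = +\mathbf{S} \tag{1.6}$$

This ensures that the total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ also transforms as $\pi\mathbf{J}\pi^\dagger = +\mathbf{J}$.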


In general, an object $\mathbf{V}$ which transforms under both rotations and parity in the
same way as $\mathbf{x}$, so that $\pi\mathbf{V}\pi^\dagger = -\mathbf{V}$, is called a vector. (You may have heard this
name before!) In contrast, an object like angular momentum which rotates like $\mathbf{x}$ but
transforms under parity as $\pi\mathbf{V}\pi^\dagger = +\mathbf{V}$ is called a pseudo-vector.

Similarly, an object $K$ which is invariant under both rotations and parity, so that
$\pi K\pi^\dagger = K$, is called a scalar. However, if it is invariant under rotations but odd under
parity, so $\pi K\pi^\dagger = -K$, it is called a pseudo-scalar. An example of a pseudo-scalar in
quantum mechanics is $\mathbf{p}\cdot\mathbf{S}$.


Although we’ve introduced these ideas in the context of quantum mechanics, they 
really descend from classical mechanics. There too, x and p are examples of vectors: 
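they flip sign in a mirror. Meanwhile, $\mathbf{L} = \mathbf{x} \times \mathbf{p}$ is a pseudo-vector: it remains pointing
in the same direction in a mirror. In electromagnetism, the electric field $\mathbf{E}$ is a vector,
while the magnetic field $\mathbf{B}$ is a pseudo-vector,

$$P : \mathbf{E} \mapsto -\mathbf{E} \quad , \quad P : \mathbf{B} \mapsto +\mathbf{B}$$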
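The distinction between vectors and pseudo-vectors is easy to see with a few numbers. The following is a small illustrative sketch, not taken from the notes: the sample values of x, p and S are arbitrary, and the helper names are my own. It reflects x and p, leaves the pseudo-vector S alone, and checks that the cross product is even while p · S is odd.

```python
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return sum(u * v for u, v in zip(a, b))

def reflect(v):
    """Parity acting on an ordinary vector: v -> -v."""
    return tuple(-c for c in v)

x = (1.0, 2.0, -0.5)   # arbitrary sample position
p = (0.3, -1.0, 2.0)   # arbitrary sample momentum
S = (0.0, 0.5, 0.5)    # arbitrary sample spin; a pseudo-vector, so parity leaves it alone

L = cross(x, p)
L_mirror = cross(reflect(x), reflect(p))   # two sign flips cancel: L is a pseudo-vector

pS = dot(p, S)
pS_mirror = dot(reflect(p), S)             # one sign flip survives: p.S is a pseudo-scalar

print(L, L_mirror)
print(pS, pS_mirror)
```

The first line prints the same triple twice, while the second prints a number and its negative.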


1.1.1 Parity as a Quantum Number 


The fact that the parity operator is Hermitian means that it is, technically, an
observable. More pertinently, we can find eigenstates of the parity operator

$$\pi|\psi\rangle = \eta_\psi|\psi\rangle$$

where $\eta_\psi$ is called the parity of the state $|\psi\rangle$. Using the fact that $\pi^2 = 1$, we have

$$\pi^2|\psi\rangle = \eta_\psi^2|\psi\rangle = |\psi\rangle \quad \Rightarrow \quad \eta_\psi = \pm 1$$

So the parity of a state can only take two values. States with $\eta_\psi = +1$ are called parity
even; those with $\eta_\psi = -1$ parity odd.


The parity eigenstates are particularly useful when parity commutes with the
Hamiltonian,

$$\pi H\pi^\dagger = H \quad \Leftrightarrow \quad [\pi, H] = 0$$

In this case, the energy eigenstates can be assigned definite parity. This follows
immediately when the energy level is non-degenerate. But even when the energy level is
degenerate, general theorems of linear algebra ensure that we can always pick a basis
within the eigenspace which has definite parity.


An Example: The Harmonic Oscillator 


As a simple example, let’s consider the one-dimensional harmonic oscillator. The
Hamiltonian is

$$H = \frac{1}{2m}p^2 + \frac{1}{2}m\omega^2x^2$$

The simplest way to build the Hilbert space is to introduce raising and lowering
operators $a \sim (x + ip/m\omega)$ and $a^\dagger \sim (x - ip/m\omega)$ (up to a normalisation constant). The
ground state $|0\rangle$ obeys $a|0\rangle = 0$ while higher states are built by $|n\rangle \sim (a^\dagger)^n|0\rangle$ (again,
ignoring a normalisation constant).


The Hamiltonian is invariant under parity: $[\pi, H] = 0$, which means that all energy
eigenstates must have a definite parity. Since the creation operator $a^\dagger$ is linear in $x$ and
$p$, we have

$$\pi a^\dagger\pi^\dagger = -a^\dagger$$

This means that the parity of the state $|n+1\rangle$ is

$$\pi|n+1\rangle = \pi a^\dagger|n\rangle = -a^\dagger\pi|n\rangle \quad \Rightarrow \quad \eta_{n+1} = -\eta_n$$

We learn that the excited states alternate in their parity. To see their absolute value,
we need only determine the parity of the ground state. This is

$$\psi_0(x) = \langle x|0\rangle \sim \exp\left(-\frac{m\omega x^2}{2\hbar}\right)$$

Since the ground state doesn’t change under reflection we have $\eta_0 = +1$ and, in general,
$\eta_n = (-1)^n$.
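The alternating parity of the oscillator states can be checked directly on the wavefunctions. The sketch below is an illustration rather than anything from the notes: it assumes units with m = ω = ħ = 1, builds the normalised eigenfunctions from the standard three-term recurrence for Hermite functions, and reads off the sign relating the wavefunction at −x and +x. The function names and the sample point x = 0.3 are arbitrary choices.

```python
import math

def psi(n, x):
    """n-th oscillator eigenfunction in units m = omega = hbar = 1.

    Uses psi_{k+1} = sqrt(2/(k+1)) x psi_k - sqrt(k/(k+1)) psi_{k-1},
    starting from the Gaussian ground state.
    """
    prev = 0.0
    cur = math.pi ** -0.25 * math.exp(-x * x / 2)
    for k in range(n):
        prev, cur = cur, (math.sqrt(2.0 / (k + 1)) * x * cur
                          - math.sqrt(k / (k + 1)) * prev)
    return cur

def parity_of(n, x=0.3):
    """Sign eta_n defined by psi_n(-x) = eta_n psi_n(x), read off at one sample point."""
    return round(psi(n, -x) / psi(n, x))

parities = [parity_of(n) for n in range(6)]
print(parities)
```

The list alternates in sign starting from +1 for the ground state. The sample point only needs to avoid a node of the wavefunction, which x = 0.3 does for the six states used here.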
Another Example: Three-Dimensional Potentials 


In three dimensions, the Hamiltonian takes the form

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{x}) \tag{1.7}$$

This is invariant under parity whenever we have a central force, with the potential
depending only on the distance from the origin: $V(\mathbf{x}) = V(r)$. In this case, the energy
eigenstates are labelled by the triplet of quantum numbers $n, l, m$ that are familiar from
the hydrogen atom, and the wavefunctions take the form

$$\psi_{n,l,m}(\mathbf{x}) = R_{n,l}(r)\,Y_{l,m}(\theta, \phi) \tag{1.8}$$


How do these transform under parity? First note that parity only acts on the spherical
harmonics $Y_{l,m}(\theta, \phi)$. In spherical polar coordinates, parity acts as

$$P : (r, \theta, \phi) \mapsto (r, \pi - \theta, \phi + \pi)$$

The action of parity on the wavefunctions therefore depends on how the spherical
harmonics transform under this change of coordinates. Up to a normalisation, the spherical
harmonics are given by

$$Y_{l,m} \sim e^{im\phi}P_l^m(\cos\theta)$$

where $P_l^m(x)$ are the associated Legendre polynomials. As we will now argue, the
transformation under parity is

$$P : Y_{l,m}(\theta, \phi) \mapsto Y_{l,m}(\pi - \theta, \phi + \pi) = (-1)^l\,Y_{l,m}(\theta, \phi) \tag{1.9}$$

This means that the wavefunction transforms as

$$P : \psi_{n,l,m}(\mathbf{x}) \mapsto \psi_{n,l,m}(-\mathbf{x}) = (-1)^l\,\psi_{n,l,m}(\mathbf{x})$$

Equivalently, written in terms of the state $|n,l,m\rangle$, where $\psi_{n,l,m}(\mathbf{x}) = \langle\mathbf{x}|n,l,m\rangle$, we
have

$$\pi|n,l,m\rangle = (-1)^l\,|n,l,m\rangle \tag{1.10}$$


It remains to prove the parity of the spherical harmonic (1.9). There’s a trick here.
We start by considering the case $l = m$ where the spherical harmonics are particularly
simple. Up to a normalisation factor, they take the form

$$Y_{l,l}(\theta, \phi) \sim e^{il\phi}\sin^l\theta$$

So in this particular case, we have

$$P : Y_{l,l}(\theta, \phi) \mapsto Y_{l,l}(\pi - \theta, \phi + \pi) \sim e^{il\phi}e^{il\pi}\sin^l(\pi - \theta) = (-1)^l\,e^{il\phi}\sin^l\theta$$

so that $Y_{l,l}(\pi - \theta, \phi + \pi) = (-1)^l\,Y_{l,l}(\theta, \phi)$,
confirming (1.9). To complete the result, we show that the parity of a state cannot
depend on the quantum number $m$. This follows from the transformation of angular
momentum (1.5) which can also be written as $[\pi, \mathbf{L}] = 0$. But recall that we can
change the quantum number $m$ by acting with the raising and lowering operators
$L_\pm = L_x \pm iL_y$. So, for example,

$$\pi|n,l,l-1\rangle = \pi L_-|n,l,l\rangle = L_-\pi|n,l,l\rangle = (-1)^l L_-|n,l,l\rangle = (-1)^l\,|n,l,l-1\rangle$$
Repeating this argument shows that (1.10) holds for all m. 
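The result (1.9) can also be checked numerically for small l, including the fact that the sign does not depend on m. The following is a rough sketch, not the argument of the notes: it evaluates the unnormalised form e^{imφ} P_l^m(cos θ) for m ≥ 0 using the standard upward recurrence for the associated Legendre functions. The function names and the sample angles are assumptions made for illustration.

```python
import cmath
import math

def assoc_legendre(l, m, x):
    """P_l^m(x) for 0 <= m <= l, by the standard upward recurrence in l."""
    # Seed: P_m^m(x) = (-1)^m (2m-1)!! (1 - x^2)^(m/2)
    pmm = 1.0
    s = math.sqrt(max(0.0, 1.0 - x * x))
    for k in range(1, m + 1):
        pmm *= -(2 * k - 1) * s
    if l == m:
        return pmm
    pm1 = x * (2 * m + 1) * pmm              # P_{m+1}^m
    for ll in range(m + 2, l + 1):
        # (ll - m) P_ll^m = x (2 ll - 1) P_{ll-1}^m - (ll + m - 1) P_{ll-2}^m
        pmm, pm1 = pm1, (x * (2 * ll - 1) * pm1 - (ll + m - 1) * pmm) / (ll - m)
    return pm1

def Y(l, m, theta, phi):
    """Unnormalised spherical harmonic e^{i m phi} P_l^m(cos theta), m >= 0."""
    return cmath.exp(1j * m * phi) * assoc_legendre(l, m, math.cos(theta))

def parity_error(l, m, theta=0.9, phi=0.4):
    """|Y(pi - theta, phi + pi) - (-1)^l Y(theta, phi)| at one sample point."""
    lhs = Y(l, m, math.pi - theta, phi + math.pi)
    rhs = (-1) ** l * Y(l, m, theta, phi)
    return abs(lhs - rhs)

max_err = max(parity_error(l, m) for l in range(5) for m in range(l + 1))
print(max_err)
```

The printed number should be at the level of rounding error. Only m ≥ 0 is covered here; the harmonics with negative m are related to these by complex conjugation, which does not change the sign under parity.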


Parity and Spin 


We can also ask how parity acts on the spin states, $|s, m_s\rangle$, of a particle. We know
from (1.6) that the operator $\mathbf{S}$ is a pseudo-vector, and so obeys $[\pi, \mathbf{S}] = 0$. The same
argument that we used above for angular momentum $\mathbf{L}$ can be re-run here to tell us
that the parity of the state cannot depend on the quantum number $m_s$. It can, however,
depend on the spin $s$,

$$\pi|s, m_s\rangle = \eta_s|s, m_s\rangle$$

What determines the value of $\eta_s$? Well, in the context of quantum mechanics nothing
determines $\eta_s$! In most situations we are dealing with a bunch of particles all of the
same spin (e.g. electrons, all of which have $s = \frac{1}{2}$). Whether we choose $\eta_s = +1$ or
$\eta_s = -1$ has no ultimate bearing on the physics. Given that it is arbitrary, we usually
pick $\eta_s = +1$.


There is, however, a caveat to this story. Within the framework of quantum field 
theory it does make sense to assign different parity transformations to different particles. 
This is equivalent to deciding whether $\eta_s = 1$ or $\eta_s = -1$ for each particle. We will
discuss this in Section 1.1.2. 


What is Parity Good For? 


We've learned that if we have a Hamiltonian that obeys $[\pi, H] = 0$, then we can
assign each energy eigenstate a sign, $\pm 1$, corresponding to whether it is even or odd
under parity. But, beyond gaining a rough understanding of what wavefunctions in
one dimension look like, we haven’t yet said why this is a useful thing to do. Here we
advertise some later results that will hinge on this:


• There are situations where one starts with a Hamiltonian that is invariant under
parity and adds a parity-breaking perturbation. The most common situation is
to take an electron with Hamiltonian (1.7) and turn on a constant electric field
$\mathbf{E}$, so the new Hamiltonian reads

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(r) - e\,\mathbf{x}\cdot\mathbf{E}$$

This no longer preserves parity. For small electric fields, we can solve this using
perturbation theory. However, this is greatly simplified by the fact that the
original eigenstates have a parity quantum number. Indeed, in nearly all situations
the first-order energy shift can be shown to vanish completely: the perturbation is
odd under parity, so its expectation value in a state of definite parity is zero. We will
describe this in some detail in Section 4.1 where we look at a hydrogen atom in an
electric field and the resulting Stark effect.


• In atomic physics, electrons sitting in higher states will often drop down to lower
states, emitting a photon as they go. This is the subject of spectroscopy. It was 
one of the driving forces behind the original development of quantum mechanics 
and will be described in some detail in Section 4.3. But it turns out that an 
electron in one level can’t drop down to any of the lower levels: there are selection 
rules which say that only certain transitions are allowed. These selection rules 
follow from the “conservation of parity”. The final state must have the same 
parity as the initial state. 


• It is often useful to organise degenerate energy levels into a basis of parity
eigenstates. If nothing else, it tends to make calculations much more straightforward.
We will see an example of this in Section 6.1.3 where we discuss scattering in one
dimension.
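To get a feel for the first of these points, here is a small numerical illustration. It is a sketch under loud assumptions: instead of the hydrogen atom, it uses the one-dimensional harmonic oscillator (units m = ω = ħ = 1) as a stand-in, since its eigenstates have parity (−1)^n and are easy to generate. The perturbation is proportional to x, which is odd under parity, so its matrix elements between states of the same parity should vanish. All function names are hypothetical.

```python
import math

def psi(n, x):
    """n-th oscillator eigenfunction (m = omega = hbar = 1) via the Hermite recurrence."""
    prev = 0.0
    cur = math.pi ** -0.25 * math.exp(-x * x / 2)
    for k in range(n):
        prev, cur = cur, (math.sqrt(2.0 / (k + 1)) * x * cur
                          - math.sqrt(k / (k + 1)) * prev)
    return cur

def x_element(m, n, X=10.0, steps=4000):
    """<m| x |n> by a simple Riemann sum on [-X, X]."""
    h = 2 * X / steps
    total = 0.0
    for j in range(steps + 1):
        x = -X + j * h
        total += psi(m, x) * x * psi(n, x)
    return total * h

same_parity = [x_element(0, 0), x_element(1, 1), x_element(0, 2)]
opposite_parity = [x_element(0, 1), x_element(1, 2)]
print(same_parity)
print(opposite_parity)
```

The entries of the first list vanish up to rounding error, including the diagonal ones that would give the first-order energy shift. The entries of the second list do not, which is the simplest instance of a parity selection rule for an operator that is odd under parity.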


1.1.2 Intrinsic Parity 


There is a sense in which every kind of particle can be assigned a parity $\pm 1$. This is called
intrinsic parity. To understand this, we really need to move beyond the framework of
non-relativistic quantum mechanics and into the framework of quantum field theory.


The key idea of quantum field theory is that the particles are ripples of an underlying 
field, tied into little bundles of energy by quantum mechanics. Whereas in quantum 
mechanics, the number of particles is fixed, in quantum field theory the Hilbert space 
(sometimes called a Fock space) contains states with different particle numbers. This 
allows us to describe various phenomena where we smash two particles together and 
many emerge. 


In quantum field theory, every particle is described by some particular state in the 
Hilbert space. And, just as we assigned a parity eigenvalue to each state above, it 
makes sense to assign a parity eigenvalue to each kind of particle. 


To determine the total parity of a configuration of particles in their centre-of-momentum
frame, we multiply the intrinsic parities together with the angular momentum parity.
For example, if two particles A and B have intrinsic parity $\eta_A$ and $\eta_B$ and relative
angular momentum $L$, then the total parity is

$$\eta = \eta_A\eta_B(-1)^L$$

To give some examples: by convention, the most familiar spin-$\frac{1}{2}$ particles all have even
parity:

electron: $\eta_e = +1$

proton: $\eta_p = +1$

neutron: $\eta_n = +1$

Each of these has an anti-particle. (The anti-electron is called the positron; the others
have the more mundane names anti-proton and anti-neutron.) Anti-particles always
have opposite quantum numbers to particles and parity is no exception: they all have
$\eta = -1$.


All other particles are also assigned an intrinsic parity. As long as the underlying 
Hamiltonian is invariant under parity, all processes must conserve parity. This is a 
useful handle to understand what processes are allowed. It is especially useful when 
discussing the strong interactions where the elementary quarks can bind into a
bewildering number of other particles: protons and neutrons, but also pions and kaons and
etas and rho mesons and omegas and sigmas and deltas. As you can see, the names are 
not particularly imaginative. There are hundreds of these particles. Collectively they 
go by the name hadrons. 


Often the intrinsic parity of a given hadron can be determined experimentally by 
observing a decay process. Knowing that parity is conserved uniquely fixes the parity 
of the particle of interest. Other decay processes must then be consistent with this. 


An Example: $\pi^- d \to n\,n$


The simplest of the hadrons are a set of particles called pions. We now know that each 
contains a quark-anti-quark pair. Apart from the proton and neutron, these are the 
longest lived of the hadrons. 


The pions come in three types: neutral, charge $+1$ and charge $-1$ (in units where
the electron has charge $-1$). They are labelled $\pi^0$, $\pi^+$ and $\pi^-$ respectively. The $\pi^-$
is observed experimentally to decay when it scatters off a deuteron, $d$, which is a stable
bound state of a proton and neutron. (We showed the existence of such a bound
state in Section 2.1.3 as an application of the variational method.) After scattering
off a deuteron, the end product is two neutrons. We write this process rather like a
chemical reaction

$$\pi^- d \to n\,n$$


From this, we can determine the intrinsic parity of the pion. First, we need some facts.
The pion has spin $s_\pi = 0$ and the deuteron has spin $s_d = 1$; the constituent proton
and neutron have no orbital angular momentum so the total angular momentum of
the deuteron is also $J = 1$. Finally, the pion scatters off the deuteron in the s-wave,
meaning that the combined $\pi^- d$ system that we start with has vanishing orbital angular
momentum. From all of this, we know that the total angular momentum of the initial
state is $J = 1$.


Since angular momentum is conserved, the final $n\,n$ state must also have $J = 1$.
Each individual neutron has spin $s_n = \frac{1}{2}$. But there are two possibilities to get $J = 1$:

• The spins could be anti-aligned, so that $S = 0$. Now the orbital angular
momentum must be $L = 1$.

• The spins could be aligned, so that the total spin is $S = 1$. In this case the orbital
angular momentum of the neutrons could be $L = 0$ or $L = 1$ or $L = 2$. Recall
that the total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ ranges from $|L - S|$ to $|L + S|$ and
so for each of $L = 0, 1$ and $2$ it contains the possibility of a $J = 1$ state.


How do we distinguish between these? It turns out that only one of these possibilities is
consistent with the fermionic nature of the neutrons. Because the end state contains two
identical fermions, the overall wavefunction must be anti-symmetric under exchange.
Let’s first consider the case where the neutron spins are anti-aligned, so that their total
spin is $S = 0$. The spin wavefunction is

$$|S = 0\rangle = \frac{|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle}{\sqrt{2}}$$

which is anti-symmetric. This means that the spatial wavefunction must be symmetric.

But this requires that the orbital angular momentum is even: $L = 0, 2, \ldots$. We see that
this is inconsistent with the conservation of angular momentum, which requires $L = 1$
when $S = 0$. We can therefore rule out the spin $S = 0$ scenario.


(An aside: the statement that wavefunctions are symmetric under interchange of
particles only if $L$ is even follows from the transformation of the spherical
harmonics under parity (1.9). Now the polar coordinates $(r, \theta, \phi)$ parameterise the
relative separation between particles. Interchange of particles is then implemented by
$(r, \theta, \phi) \mapsto (r, \pi - \theta, \phi + \pi)$.)


Let’s now move onto the second option where the total spin of neutrons is $S = 1$.
Here the spin wavefunctions are symmetric, with the three choices depending on the
quantum number $m_s = -1, 0, +1$,

$$|S=1, 1\rangle = |\uparrow\rangle|\uparrow\rangle \quad , \quad |S=1, 0\rangle = \frac{|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle}{\sqrt{2}} \quad , \quad |S=1, -1\rangle = |\downarrow\rangle|\downarrow\rangle$$

Once again, the total wavefunction must be anti-symmetric, which means that the
spatial part must be anti-symmetric. This, in turn, requires that the orbital angular
momentum of the two neutrons is odd: $L = 1, 3, \ldots$. Looking at the options consistent
with angular momentum conservation, we see that only the $L = 1$ state is allowed.


Having figured out the angular momentum, we’re now in a position to discuss parity.
The parity of each neutron is $\eta_n = +1$. The parity of the proton is also $\eta_p = +1$ and
since these two particles have no angular momentum in their deuteron bound state, we
have $\eta_d = \eta_n\eta_p = +1$. Conservation of parity, with $L = 0$ in the initial state and $L = 1$
in the final state, then tells us

$$\eta_\pi\eta_d = (\eta_n)^2(-1)^L \quad \Rightarrow \quad \eta_\pi = -1$$
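The bookkeeping in this example is simple enough to automate. The following sketch, which is only an illustration with made-up function and variable names, enumerates the candidate (S, L) combinations for the two-neutron final state with J = 1, keeps those with an anti-symmetric total wavefunction, and then solves the parity conservation condition for the pion.

```python
def allowed_nn_states(J=1, L_max=3):
    """(S, L) pairs for two neutrons that can give total J and are anti-symmetric overall."""
    states = []
    for S in (0, 1):
        for L in range(L_max + 1):
            # Angular momentum addition: J must lie between |L - S| and L + S.
            if not (abs(L - S) <= J <= L + S):
                continue
            spin_sym = +1 if S == 1 else -1      # triplet symmetric, singlet anti-symmetric
            space_sym = (-1) ** L                # exchange acts like parity on the relative coordinate
            if spin_sym * space_sym == -1:       # identical fermions
                states.append((S, L))
    return states

states = allowed_nn_states()
S, L = states[0]

eta_n = eta_p = +1
eta_d = eta_n * eta_p * (-1) ** 0        # proton and neutron in an L = 0 bound state
eta_final = eta_n * eta_n * (-1) ** L    # two neutrons with relative angular momentum L
eta_pi = eta_final * eta_d               # from eta_pi * eta_d = eta_final, using eta_d = +/-1

print(states, eta_pi)
```

Only the spin-triplet state with L = 1 survives, and the pion comes out with odd intrinsic parity, in agreement with the argument above.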


Parity and the Fundamental Forces 


Above, I said that parity is conserved if the underlying Hamiltonian is invariant under 
parity. So one can ask: are the fundamental laws of physics, at least as we currently 
know them, invariant under parity? The answer is: some of them are. But not all. 


In our current understanding of the laws of physics, there are five different ways in 
which particles can interact: through gravity, electromagnetism, the weak nuclear force, 
the strong nuclear force and, finally, through the Higgs field. The first four of these are 
usually referred to as “fundamental forces”, while the Higgs field is kept separate. For 
what it’s worth, the Higgs has more in common with three of the forces than gravity 
does and one could make an argument that it too should be considered a “force”. 


Of these five interactions, four appear to be invariant under parity. The misfit is 
the weak interaction. This is not invariant under parity, which means that any process
which occurs through the weak interaction, such as beta decay, need not conserve
parity. Violation of parity in experiments was first observed by Chien-Shiung Wu in 
1956. 


To the best of our knowledge, the Hamiltonians describing the other four interactions 
are invariant under parity. In many processes — including the pion decay described 
above — the strong force is at play and the weak force plays no role. In these cases, 
parity is conserved. 


1.2 Time Reversal Invariance 
Time reversal holds a rather special position in quantum mechanics. As we will see, it
is not like other symmetries.


The idea of time reversal is simple: take a movie of the system in motion and play 
it backwards. If the system is invariant under the symmetry of time reversal, then the 
dynamics you see on the screen as the movie runs backwards should also describe a 
possible evolution of the system. Mathematically, this means that we should replace 
$t \mapsto -t$ in our equations and find another solution.


Classical Mechanics 


Let’s first look at what this means in the context of classical mechanics. As our first 
example, consider the Newtonian equation of motion for a particle of mass m moving 
in a potential V, 


$$m\ddot{\mathbf{x}} = -\nabla V(\mathbf{x})$$

Such a system is invariant under time reversal: if $\mathbf{x}(t)$ is a solution, then so too is
$\mathbf{x}(-t)$.


As a second example, consider the same system but with the addition of a friction 
term. The equation of motion is now 


$$m\ddot{\mathbf{x}} = -\nabla V(\mathbf{x}) - \gamma\dot{\mathbf{x}}$$

This system is no longer time reversal invariant. Physically, this should be clear: if you watch a
movie of some guy sliding along in his socks until he comes to rest, it’s pretty obvious if 
it’s running forward in time or backwards in time. Mathematically, if x(t) is a solution, 
then x(—t) fails to be a solution because the equation of motion includes a term that 
is first order in the time derivative. 
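One way to see the difference between these two examples on a computer is to play the movie backwards. The sketch below is an illustration under stated assumptions, not something from the notes: it takes a one-dimensional harmonic potential V = x²/2 with m = 1, integrates with a hand-written fourth-order Runge-Kutta step, runs forwards for a time T, flips the velocity, runs for another time T and flips the velocity again. If the dynamics is time reversal invariant we should land back on the initial data. All names and parameter values are arbitrary choices.

```python
def accel(x, v, gamma):
    """m = 1 and V(x) = x^2 / 2, so the equation of motion is x'' = -x - gamma x'."""
    return -x - gamma * v

def rk4_step(x, v, dt, gamma):
    k1x, k1v = v, accel(x, v, gamma)
    k2x, k2v = v + 0.5 * dt * k1v, accel(x + 0.5 * dt * k1x, v + 0.5 * dt * k1v, gamma)
    k3x, k3v = v + 0.5 * dt * k2v, accel(x + 0.5 * dt * k2x, v + 0.5 * dt * k2v, gamma)
    k4x, k4v = v + dt * k3v, accel(x + dt * k3x, v + dt * k3v, gamma)
    x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
    v += dt * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return x, v

def evolve(x, v, T, gamma, dt=1e-3):
    for _ in range(round(T / dt)):
        x, v = rk4_step(x, v, dt, gamma)
    return x, v

def round_trip_error(gamma, T=5.0):
    """Run forwards, reverse the velocity, run again, reverse again; compare with the start."""
    x0, v0 = 1.0, 0.0
    x1, v1 = evolve(x0, v0, T, gamma)
    x2, v2 = evolve(x1, -v1, T, gamma)
    return abs(x2 - x0) + abs(-v2 - v0)

err_conservative = round_trip_error(gamma=0.0)
err_friction = round_trip_error(gamma=0.5)
print(err_conservative, err_friction)
```

Without friction the particle retraces its steps up to integration error. With friction it keeps losing energy on the return leg, so it ends up nowhere near where it started.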


At a deeper level, the first example above arises from a Hamiltonian while the second 
example, involving friction, does not. One might wonder if all Hamiltonian systems are 
time reversal invariant. This is not the case. As our final example, consider a particle 
of charge q moving in a magnetic field. The equation of motion is 


$$m\ddot{\mathbf{x}} = q\,\dot{\mathbf{x}} \times \mathbf{B} \tag{1.11}$$


Once again, the equation of motion includes a term that is first order in time derivatives, 
which means that the time reversed motion is not a solution. This time it occurs because 
particles always move with a fixed handedness in the presence of a magnetic field: they 
either move clockwise or anti-clockwise in the plane perpendicular to B. 


Although the system described by (1.11) is not invariant under time reversal, if you’re 
shown a movie of the solution running backwards in time, then it won’t necessarily be 
obvious that this is unphysical. This is because the trajectory x(—t) does solve (1.11) if 
we also replace the magnetic field B with —B. For this reason, we sometimes say that 
the background magnetic field flips sign under time reversal. (Alternatively, we could 
choose to keep $\mathbf{B}$ unchanged, but flip the sign of the charge: $q \mapsto -q$. The standard
convention, however, is to keep charges unchanged under time reversal.) 
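The same backwards-movie test can be applied to (1.11). In the sketch below, which is again only an illustration with arbitrary parameter values and hypothetical function names, a particle of unit mass moves in the plane perpendicular to a constant field, so only the product qB matters. We run forwards, reverse the velocity and run again, either with the same field or with the field flipped.

```python
def deriv(state, qB):
    """Planar motion with B along z and m = 1: x'' = qB y', y'' = -qB x'."""
    x, y, vx, vy = state
    return (vx, vy, qB * vy, -qB * vx)

def shifted(state, k, h):
    return tuple(s + h * ki for s, ki in zip(state, k))

def rk4_step(state, dt, qB):
    k1 = deriv(state, qB)
    k2 = deriv(shifted(state, k1, 0.5 * dt), qB)
    k3 = deriv(shifted(state, k2, 0.5 * dt), qB)
    k4 = deriv(shifted(state, k3, dt), qB)
    return tuple(s + dt * (a + 2 * b + 2 * c + d) / 6
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def evolve(state, T, qB, dt=1e-3):
    for _ in range(round(T / dt)):
        state = rk4_step(state, dt, qB)
    return state

def reverse(state):
    x, y, vx, vy = state
    return (x, y, -vx, -vy)

def round_trip_error(flip_B, T=2.0, qB=1.0):
    start = (0.0, 0.0, 1.0, 0.0)
    mid = evolve(start, T, qB)
    end = reverse(evolve(reverse(mid), T, -qB if flip_B else qB))
    return max(abs(a - b) for a, b in zip(end, start))

err_same_B = round_trip_error(flip_B=False)
err_flipped_B = round_trip_error(flip_B=True)
print(err_same_B, err_flipped_B)
```

With the field held fixed the particle keeps circling with the same handedness and does not retrace its path. Flipping the sign of B on the return leg restores the round trip, which is the sense in which the background field is odd under time reversal.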


We can gather together how various quantities transform under time reversal, which
we’ll denote as $T$. Obviously $T : t \mapsto -t$. Meanwhile, the standard dynamical variables,
which include position $\mathbf{x}$ and momentum $\mathbf{p} = m\dot{\mathbf{x}}$, transform as

$$T : \mathbf{x}(t) \mapsto \mathbf{x}(-t) \quad , \quad T : \mathbf{p}(t) \mapsto -\mathbf{p}(-t) \tag{1.12}$$

Finally, as we’ve seen, it can also be useful to think about time reversal as acting on
background fields. The electric field $\mathbf{E}$ and magnetic field $\mathbf{B}$ transform as

$$T : \mathbf{E} \mapsto \mathbf{E} \quad , \quad T : \mathbf{B} \mapsto -\mathbf{B}$$

These simple considerations will be useful as we turn to quantum mechanics.


Quantum Mechanics 


We’ll now try to implement these same ideas in quantum mechanics. As we will see,
there is something of a subtlety. This is first apparent if we look at the time-dependent
Schrödinger equation,

$$i\hbar\frac{\partial\psi}{\partial t} = H\psi \tag{1.13}$$

We’ll assume that the Hamiltonian $H$ is invariant under time reversal. (For example,
$H = \mathbf{p}^2/2m + V(\mathbf{x})$.) One might naively think that the wavefunction should evolve
in a manner compatible with time reversal. However, the Schrödinger equation is first
order in time derivatives and this tells us something which seems to go against this
intuition: if $\psi(t)$ is a solution then $\psi(-t)$ is not, in general, another solution.


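To emphasise this, note that the Schrödinger equation is not very different from the
heat equation,

$$\frac{\partial\psi}{\partial t} = \mu\nabla^2\psi$$

This equation clearly isn’t time reversal invariant, a fact which underlies the entire
subject of thermodynamics. The Schrödinger equation (1.13) only differs by a factor
of $i$. How does that save us? Well, it ensures that if $\psi(t)$ is a solution, then $\psi^\star(-t)$ is
also a solution. To see this for a Hamiltonian such as $H = \mathbf{p}^2/2m + V(\mathbf{x})$, which is real in
the position representation, take the complex conjugate of (1.13): this flips the sign of
the left-hand side, and sending $t \mapsto -t$ flips it back. This, then, is the action of time
reversal on the wavefunction,

$$T : \psi(t) \mapsto \psi^\star(-t) \tag{1.14}$$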


The need to include the complex conjugation is what distinguishes time reversal from 
other symmetries that we have met. 
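The role of the complex conjugation can be checked on an explicit solution. The sketch below is illustrative only: it assumes a free particle in one dimension with ħ = m = 1, builds a wavefunction from three plane waves with arbitrarily chosen wavenumbers and coefficients, and measures by finite differences how badly a given function fails to solve the Schrödinger equation.

```python
import cmath

# Free particle, hbar = m = 1:  i d(psi)/dt = -(1/2) d^2(psi)/dx^2.
# (k, c_k) pairs chosen arbitrarily for illustration.
modes = [(1.0, 0.7 + 0.2j), (-2.0, 0.3 - 0.5j), (0.5, 1.0j)]

def psi(x, t):
    """Superposition of plane waves exp(i(kx - k^2 t / 2)), an exact solution."""
    return sum(c * cmath.exp(1j * (k * x - 0.5 * k * k * t)) for k, c in modes)

def naive_reversal(x, t):
    return psi(x, -t)

def time_reversed(x, t):
    return psi(x, -t).conjugate()

def residual(f, x=0.3, t=0.4, h=1e-3):
    """|i df/dt + (1/2) d^2f/dx^2| at one point, using central differences."""
    dfdt = (f(x, t + h) - f(x, t - h)) / (2 * h)
    d2fdx2 = (f(x + h, t) - 2 * f(x, t) + f(x - h, t)) / (h * h)
    return abs(1j * dfdt + 0.5 * d2fdx2)

res_original = residual(psi)
res_naive = residual(naive_reversal)
res_reversed = residual(time_reversed)
print(res_original, res_naive, res_reversed)
```

The first and third residuals are small, limited only by the finite-difference error, while the second is of order one. The conjugated function is again a sum of plane waves, now with each momentum reversed.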


How do we fit this into our general scheme to describe the action of symmetries on
operators and states? We’re looking for an operator $\Theta$ such that time reversal maps
any state $|\psi\rangle$ to

$$T : |\psi\rangle \mapsto \Theta|\psi\rangle$$

Let’s think about what properties we want from the action of $\Theta$. Classically, the action
of time reversal on the state of a system leaves the positions unchanged, but flips the
sign of all the momenta, as we saw in (1.12). Roughly speaking, we want $\Theta$ to do the
same thing to the quantum state. How can we achieve this?
Let’s first recall how we run a state forwards in time. The solution to (1.13) tells us
that a state $|\psi(0)\rangle$ evolves into a state $|\psi(t)\rangle$ by the usual unitary evolution

$$|\psi(t)\rangle = e^{-iHt/\hbar}|\psi(0)\rangle$$

Suppose now that we instead take the time reversed state $\Theta|\psi(0)\rangle$ and evolve this
forward in time. If the Hamiltonian itself is time reversal invariant, the resulting state
should be the time reversal of taking $|\psi(0)\rangle$ and evolving it backwards in time. (Or,
said another way, it should be the time reversal of $|\psi(t)\rangle$, which is the same thing as
$\Theta|\psi(-t)\rangle$.) While that’s a mouthful in words, it’s simple to write in equations: we
want $\Theta$ to satisfy

$$\Theta\,e^{+iHt/\hbar}|\psi(0)\rangle = e^{-iHt/\hbar}\,\Theta|\psi(0)\rangle$$

Expanding this out for infinitesimal time $t$, we get the requirement

$$\Theta\,iH = -iH\,\Theta \tag{1.15}$$

Our job is to find a $\Theta$ obeying this property.


At this point there’s a right way and a wrong way to proceed. I’ll first describe the
wrong way because it’s the most tempting path to take. It’s natural to manipulate
(1.15) by cancelling the factor of $i$ on both sides to leave us with

$$\Theta H + H\Theta = 0 \ ? \tag{1.16}$$

Although natural, this is wrong! It’s simple to see why. Suppose that we have an
eigenstate $|\psi\rangle$ obeying $H|\psi\rangle = E|\psi\rangle$. Then (1.16) tells us that
$H\Theta|\psi\rangle = -\Theta H|\psi\rangle = -E\,\Theta|\psi\rangle$. So every state of energy $E$ must be accompanied by a time-reversed state
of energy $-E$. But that’s clearly nonsense. We know it’s not true of the harmonic
oscillator.


So what did we do wrong? Well, the incorrect step was seemingly the most innocuous 
one: we are not allowed to cancel the factors of i on either side of (1.15). To see why, 
we need to step back and look at a little linear algebra. 

1.2.1 Time Reversal is an Anti-Unitary Operator 


Usually in quantum mechanics we deal with linear operators acting on the Hilbert 
space. The linearity means that the action of an operator A on superpositions of states 
is 


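$$A(\alpha|\psi_1\rangle + \beta|\psi_2\rangle) = \alpha A|\psi_1\rangle + \beta A|\psi_2\rangle$$

with $\alpha, \beta \in \mathbb{C}$. In contrast, an anti-linear operator $B$ obeys the modified condition

$$B(\alpha|\psi_1\rangle + \beta|\psi_2\rangle) = \alpha^\star B|\psi_1\rangle + \beta^\star B|\psi_2\rangle \tag{1.17}$$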


This complex conjugation is reminiscent of the transformation of the wavefunction 
(1.14) under time reversal. Indeed, we will soon see how they are related. 


The strange action (1.17) means that an anti-linear operator $B$ doesn’t even commute
with a constant $\alpha \in \mathbb{C}$ (which, here, we view as a particularly simple operator which
multiplies each state by $\alpha$). Instead, when $B$ is anti-linear we have

$$B\alpha = \alpha^\star B$$

But this is exactly what we need to resolve the problem that we found above. If we
take $\Theta$ to be an anti-linear operator then the factor of $i$ on the left-hand-side of (1.15)
is complex conjugated when we pull it through $\Theta$. This extra minus sign means that
instead of (1.16), we find

$$[\Theta, H] = 0 \tag{1.18}$$


This looks more familiar. Indeed, we saw earlier that this usually implies we have a 
conserved quantity in the game. However, that will turn out not to be the case here: 
conserved quantities only arise when linear operators commute with H. Nonetheless, 
we will see that there are also some interesting consequences of (1.18) for time-reversal. 
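The difference between (1.16) and (1.18) is easy to check numerically. The following is a minimal sketch, not part of the original argument: it assumes NumPy, represents time reversal by complex conjugation of the components of a state (anticipating the spinless example below), and uses a random real symmetric matrix as a stand-in for a time-reversal invariant Hamiltonian. The names `theta`, `H` and `psi` are illustrative choices.

```python
import numpy as np

def theta(psi):
    """Anti-linear map: complex conjugation of the components of psi."""
    return np.conj(psi)

rng = np.random.default_rng(0)
n = 6

# A real symmetric matrix stands in for a time-reversal invariant Hamiltonian.
M = rng.normal(size=(n, n))
H = (M + M.T) / 2
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

# Anti-linearity: Theta(alpha psi) = alpha^* Theta(psi).
alpha = 0.3 - 1.7j
antilinear_ok = np.allclose(theta(alpha * psi), np.conj(alpha) * theta(psi))

# Requirement (1.15): Theta (iH) psi = -iH Theta psi.
eq_115_ok = np.allclose(theta(1j * H @ psi), -1j * H @ theta(psi))

# The correct consequence (1.18): Theta H psi = H Theta psi ...
eq_118_ok = np.allclose(theta(H @ psi), H @ theta(psi))
# ... whereas the naive guess (1.16), Theta H + H Theta = 0, fails.
eq_116_ok = np.allclose(theta(H @ psi) + H @ theta(psi), 0)

print(antilinear_ok, eq_115_ok, eq_118_ok, eq_116_ok)
```

The first three checks succeed and the last one fails, which is the numerical counterpart of the statement that we may not cancel the factors of $i$ in (1.15).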


We see above that we dodge a bullet if time reversal is enacted by an anti-linear
operator $\Theta$. There is another, more direct, way to see that this has to be the case.
This arises by considering its action on the operators $\mathbf{x}$ and $\mathbf{p}$. In analogy with the
classical action (1.12), we require

$$\Theta\,\mathbf{x}\,\Theta^{-1} = \mathbf{x} \ , \quad \Theta\,\mathbf{p}\,\Theta^{-1} = -\mathbf{p} \qquad (1.19)$$

However, quantum mechanics comes with a further requirement: the commutation relations
between these operators should be preserved under time reversal. In particular,
we must have

$$[x_i, p_j] = i\hbar\,\delta_{ij} \quad\Rightarrow\quad \Theta\,[x_i, p_j]\,\Theta^{-1} = \Theta\,(i\hbar\,\delta_{ij})\,\Theta^{-1}$$

We see that the transformations (1.19) are not consistent with the commutation relations
if $\Theta$ is a linear operator: the left-hand side becomes $[x_i, -p_j] = -i\hbar\,\delta_{ij}$, while
the right-hand side would remain $+i\hbar\,\delta_{ij}$. But the fact that it is an anti-linear operator saves us:
the factor of $i$ sandwiched between operators on the right-hand side is conjugated and
the equation becomes $\Theta[x_i, p_j]\Theta^{-1} = -i\hbar\,\delta_{ij}$ which is happily consistent with (1.19).

Linear Algebra with Anti-Linear Operators


Time reversal is described by an anti-linear operator $\Theta$. This means that we’re going
to have to spend a little time understanding the properties of these unusual operators.

We know that $\Theta$ acts on the Hilbert space $\mathcal{H}$ as (1.17). But how does it act on the
dual Hilbert space of bras? Recall that, by definition, each element $\langle\phi|$ of the dual
Hilbert space should be thought of as a linear map $\langle\phi| : \mathcal{H} \to \mathbb{C}$. For a linear operator
$A$, this is sufficient to tell us how to think of $A$ acting on the dual Hilbert space. The
dual state $\langle\phi|A$ is defined by

$$(\langle\phi|A)\,|\psi\rangle = \langle\phi|\,(A|\psi\rangle) \qquad (1.20)$$

This definition has the consequence that we can just drop the brackets and talk about
$\langle\phi|A|\psi\rangle$ since it doesn’t matter whether we interpret this as $A$ acting to the right
or to the left.

In contrast, things are more fiddly if we’re dealing with an anti-linear operator $B$.
We would like to define $\langle\phi|B$. The problem is that we want $\langle\phi|B$ to lie in the dual
Hilbert space which, by definition, means that it must be a linear map even if $B$
is an anti-linear operator. But if we just repeat the definition (1.20) then it’s simple
to check that $\langle\phi|B$ inherits anti-linear behaviour from $B$ and so does not lie in the
dual Hilbert space. To remedy this, we modify our definition of $\langle\phi|B$ for anti-linear
operators to

$$(\langle\phi|B)\,|\psi\rangle = \left[\langle\phi|\,(B|\psi\rangle)\right]^* \qquad (1.21)$$

This means, in particular, that for an anti-linear operator we should never write $\langle\phi|B|\psi\rangle$
because we get different answers depending on whether $B$ acts on the ket to the right
or on the bra to the left. This is, admittedly, fiddly. Ultimately the Dirac bra-ket
notation is not so well suited to anti-linear operators.


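Our next task is to define the adjoint operators. Recall that for a linear operator $A$,
the adjoint $A^\dagger$ is defined by the requirement

$$\langle\phi|A^\dagger|\psi\rangle = \langle\psi|A|\phi\rangle^*$$

What do we do for an anti-linear operator $B$? The correct definition is now

$$\langle\phi|\,(B^\dagger|\psi\rangle) = \left[(\langle\psi|B)\,|\phi\rangle\right]^* = \langle\psi|\,(B|\phi\rangle) \qquad (1.22)$$

This ensures that $B^\dagger$ is also anti-linear. Finally, we say that an anti-linear operator $B$
is anti-unitary if it also obeys

$$B^\dagger B = BB^\dagger = 1$$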


Anti-Unitary Operators Conserve Probability 


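We have already seen that time reversal should be anti-linear. It must also be anti-unitary.
This will ensure that probabilities are conserved under time reversal. To see
this, consider the states $|\phi'\rangle = \Theta|\phi\rangle$ and $|\psi'\rangle = \Theta|\psi\rangle$. Then, using our definitions
above, we have

$$\langle\phi'|\psi'\rangle = (\langle\phi|\Theta^\dagger)(\Theta|\psi\rangle) = \left[\langle\phi|\,(\Theta^\dagger\Theta|\psi\rangle)\right]^* = \langle\phi|\psi\rangle^*$$

We see that the phase of the amplitude changes under time reversal, but the probability,
which is $|\langle\phi|\psi\rangle|^2$, remains unchanged.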
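As a quick numerical illustration of this statement, here is a sketch under stated assumptions. It assumes NumPy and uses the fact that, in a fixed basis, an anti-unitary operator can be written as a unitary matrix followed by complex conjugation of the components. The random unitary `U` and the helper `theta` are hypothetical choices made only for this check.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4

# A random unitary from the QR decomposition of a complex Gaussian matrix.
Z = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, _ = np.linalg.qr(Z)

def theta(psi):
    """An anti-unitary map: conjugate the components, then apply U."""
    return U @ np.conj(psi)

phi = rng.normal(size=n) + 1j * rng.normal(size=n)
psi = rng.normal(size=n) + 1j * rng.normal(size=n)

amplitude = np.vdot(phi, psi)                       # <phi|psi>
amplitude_reversed = np.vdot(theta(phi), theta(psi))  # <phi'|psi'>

# The amplitude is complex conjugated, so the probability is unchanged.
print(np.isclose(amplitude_reversed, np.conj(amplitude)))
print(np.isclose(abs(amplitude_reversed) ** 2, abs(amplitude) ** 2))
```

Note that `np.vdot` conjugates its first argument, so it computes the usual inner product of two kets.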


1.2.2 An Example: Spinless Particles 


So far, we’ve only described the properties required of the time reversal operator $\Theta$.
Now let’s look at some specific examples. We start with a single particle, governed by
the Hamiltonian

$$H = \frac{\mathbf{p}^2}{2m} + V(\mathbf{x})$$


To describe any operator, it’s sufficient to define how it acts on a basis of states. The
time reversal operator is no different and, for the present example, it’s sensible to choose
the basis of position eigenstates $|\mathbf{x}\rangle$. Because $\Theta$ is anti-linear, it’s important that we pick some
fixed choice of phase for each $|\mathbf{x}\rangle$. (The exact choice doesn’t matter; just as long as we
make one.) Then we define the time reversal operator to be

$$\Theta|\mathbf{x}\rangle = |\mathbf{x}\rangle \qquad (1.23)$$

If $\Theta$ were a linear operator, this definition would mean that it must be equal to the
identity. But instead $\Theta$ is anti-linear and its action on states which differ by a phase
from our choice of basis $|\mathbf{x}\rangle$ is non-trivial

$$\Theta\,\alpha|\mathbf{x}\rangle = \alpha^*|\mathbf{x}\rangle$$

In this case, the adjoint operator is simple: $\Theta^\dagger = \Theta$. Indeed, it’s simple to see that
$\Theta^2 = 1$, as is required by anti-unitarity.


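Let’s see what we can derive from this. First, we can expand a general state $|\psi\rangle$ as

$$|\psi\rangle = \int d^3x\ |\mathbf{x}\rangle\langle\mathbf{x}|\psi\rangle = \int d^3x\ \psi(\mathbf{x})\,|\mathbf{x}\rangle$$

where $\psi(\mathbf{x}) = \langle\mathbf{x}|\psi\rangle$ is the wavefunction in position variables. Time reversal acts as

$$\Theta|\psi\rangle = \int d^3x\ \Theta\,\psi(\mathbf{x})|\mathbf{x}\rangle = \int d^3x\ \psi^*(\mathbf{x})\,\Theta|\mathbf{x}\rangle = \int d^3x\ \psi^*(\mathbf{x})\,|\mathbf{x}\rangle$$

We learn that time reversal acts on the wavefunction as complex conjugation:
$T : \psi(\mathbf{x}) \mapsto \psi^*(\mathbf{x})$. But this is exactly what we first saw in (1.14) from looking at the
Schrödinger equation. We can also specialise to momentum eigenstates $|\mathbf{p}\rangle$. These can
be written as

$$|\mathbf{p}\rangle = \int d^3x\ e^{i\mathbf{p}\cdot\mathbf{x}/\hbar}\,|\mathbf{x}\rangle$$

Acting with time reversal, this becomes

$$\Theta|\mathbf{p}\rangle = \int d^3x\ \Theta\,e^{i\mathbf{p}\cdot\mathbf{x}/\hbar}|\mathbf{x}\rangle = \int d^3x\ e^{-i\mathbf{p}\cdot\mathbf{x}/\hbar}\,|\mathbf{x}\rangle = |-\mathbf{p}\rangle$$

which confirms our intuition that acting with time reversal on a state should leave
positions invariant, but flip the momenta.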
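The same statement can be seen on a lattice. The sketch below is an illustration only: it assumes NumPy, a periodic one-dimensional lattice of `N` sites, and a discrete plane wave with integer wavenumber `k`. These are choices made here for convenience and do not appear in the text. Complex conjugation leaves the position probabilities alone but moves the peak of the Fourier transform from $k$ to $-k$.

```python
import numpy as np

N, k = 64, 5
x = np.arange(N)

# Discrete plane wave with wavenumber k on a periodic lattice.
psi = np.exp(2j * np.pi * k * x / N) / np.sqrt(N)

# Signed integer wavenumbers associated with each FFT bin.
freqs = np.fft.fftfreq(N, d=1.0 / N)

def dominant_wavenumber(state):
    """Wavenumber at which the Fourier transform of state is largest."""
    spectrum = np.abs(np.fft.fft(state))
    return int(round(freqs[np.argmax(spectrum)]))

k_before = dominant_wavenumber(psi)
k_after = dominant_wavenumber(np.conj(psi))   # time-reversed state

# Positions are untouched: |psi(x)|^2 is the same before and after.
positions_unchanged = np.allclose(np.abs(psi) ** 2, np.abs(np.conj(psi)) ** 2)

print(k_before, k_after, positions_unchanged)
```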


Importantly, invariance under time reversal doesn’t lead to any degeneracy of the
spectrum in this system. Instead, it’s not hard to show that one can always pick the
phase of an energy eigenstate such that it is also an eigenstate of $\Theta$. Ultimately, this is
because of the relation $\Theta^2 = 1$. (This statement will become clearer in the next section
where we’ll see a system that does exhibit a degeneracy.)


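We can tell this same story in terms of operators. These can be expanded in terms
of eigenstates, so we have

$$\hat{\mathbf{x}} = \int d^3x\ \mathbf{x}\,|\mathbf{x}\rangle\langle\mathbf{x}| \quad\Rightarrow\quad \Theta\,\hat{\mathbf{x}}\,\Theta^{-1} = \int d^3x\ \mathbf{x}\ \Theta|\mathbf{x}\rangle\langle\mathbf{x}|\Theta^{-1} = \hat{\mathbf{x}}$$

and

$$\hat{\mathbf{p}} = \int d^3p\ \mathbf{p}\,|\mathbf{p}\rangle\langle\mathbf{p}| \quad\Rightarrow\quad \Theta\,\hat{\mathbf{p}}\,\Theta^{-1} = \int d^3p\ \mathbf{p}\ \Theta|\mathbf{p}\rangle\langle\mathbf{p}|\Theta^{-1} = -\hat{\mathbf{p}}$$

where, in each case, we’ve reverted to putting a hat on the operator to avoid confusion.
We see that this reproduces our expectation (1.19).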


Before we proceed, it will be useful to discuss one last property that arises when
$V(\mathbf{x}) = V(|\mathbf{x}|)$ is a central potential. In this case, the orbital angular momentum
$\mathbf{L} = \mathbf{x}\times\mathbf{p}$ is also conserved. From (1.19), we know that $\mathbf{L}$ should be odd under time
reversal, meaning

$$\Theta\,\mathbf{L}\,\Theta^{-1} = -\mathbf{L} \qquad (1.24)$$

We can also see how it acts on states. For a central potential, the energy eigenstates
can be written in polar coordinates as

$$\psi_{nlm}(\mathbf{x}) = R_{nl}(r)\,Y_{lm}(\theta,\phi)$$

The radial wavefunction $R_{nl}(r)$ can always be taken to be real. Meanwhile, the spherical
harmonics take the form $Y_{lm}(\theta,\phi) = e^{im\phi}P_l^m(\cos\theta)$ with $P_l^m$ an associated Legendre
polynomial. From their definition, we find that these obey

$$\psi^*_{nlm}(\mathbf{x}) = (-1)^m\,\psi_{nl,-m}(\mathbf{x}) \qquad (1.25)$$

Clearly this is consistent with $\Theta^2 = 1$.


1.2.3 Another Example: Spin 


Here we describe a second example that is both more subtle and more interesting: it is
a particle carrying spin $\frac{1}{2}$. To highlight the physics, we can forget about the position
degrees of freedom and focus solely on the spin.

Spin provides another contribution to angular momentum. This means that the spin
operator $\mathbf{S}$ should be odd under time reversal, just like the orbital angular momentum
(1.24)

$$\Theta\,\mathbf{S}\,\Theta^{-1} = -\mathbf{S} \qquad (1.26)$$

For a spin-$\frac{1}{2}$ particle, we have $\mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma}$ with $\boldsymbol{\sigma}$ the vector of Pauli matrices. The Hilbert
space is just two-dimensional and we take the usual basis of eigenvectors of $S_z$, chosen
with a specific phase

$$|+\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad\text{and}\quad |-\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

so that $S_z|\pm\rangle = \pm\frac{\hbar}{2}|\pm\rangle$. Our goal is to understand how the operator $\Theta$ acts on these
states. We will simply state the correct form and then check that it does indeed
reproduce (1.26). The action of time reversal is

$$\Theta|+\rangle = i|-\rangle \ , \quad \Theta|-\rangle = -i|+\rangle \qquad (1.27)$$

Let’s look at some properties of this. First, consider the action of $\Theta^2$,

$$\Theta^2|+\rangle = \Theta(i|-\rangle) = -i\,\Theta|-\rangle = -|+\rangle$$
$$\Theta^2|-\rangle = \Theta(-i|+\rangle) = i\,\Theta|+\rangle = -|-\rangle$$

We see that

$$\Theta^2 = -1 \qquad (1.28)$$

This is in contrast to the action of time reversal for a particle without spin (1.23). We
will see shortly the consequences of $\Theta^2 = -1$.


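Since there are a lot of $i$’s floating around, let’s go slowly and use this as an opportunity
to flesh out other properties of $\Theta$. From (1.21), the action of $\Theta$ on the bras is

$$\langle +|\Theta = i\langle -| \ , \quad \langle -|\Theta = -i\langle +|$$

Meanwhile, from (1.22), the adjoint operator $\Theta^\dagger$ is defined as

$$\Theta^\dagger|+\rangle = -i|-\rangle \ , \quad \Theta^\dagger|-\rangle = i|+\rangle$$

We see that $\Theta^\dagger = -\Theta$ which, given (1.28), ensures that $\Theta$ is anti-unitary.

Now we can look at the action of $\Theta$ on the various spin operators. Expanding each
in our chosen basis, and using the results above, we find

$$S_x = \frac{\hbar}{2}\big(|+\rangle\langle -| + |-\rangle\langle +|\big) \quad\Rightarrow\quad \Theta S_x\Theta^\dagger = -S_x$$
$$S_y = \frac{\hbar}{2}\big(-i|+\rangle\langle -| + i|-\rangle\langle +|\big) \quad\Rightarrow\quad \Theta S_y\Theta^\dagger = -S_y$$
$$S_z = \frac{\hbar}{2}\big(|+\rangle\langle +| - |-\rangle\langle -|\big) \quad\Rightarrow\quad \Theta S_z\Theta^\dagger = -S_z$$

as required.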
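These manipulations are easy to get wrong by a sign, so a numerical cross-check is useful. The sketch below is an illustration, not part of the derivation. It assumes NumPy, units with $\hbar = 1$, and the representation of time reversal as "conjugate the components, then multiply by a unitary matrix". For the action (1.27) that unitary matrix works out to be the Pauli matrix $\sigma_y$, so that conjugating an operator $A$ by time reversal amounts to computing $\sigma_y A^* \sigma_y^\dagger$. The helper names are hypothetical.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

U = sy  # Theta = U K, with K complex conjugation in the S_z basis

def theta(psi):
    """Action of time reversal on a two-component spinor."""
    return U @ np.conj(psi)

def conjugate_by_theta(A):
    """Theta A Theta^{-1} for Theta = U K, which equals U A^* U^dagger."""
    return U @ np.conj(A) @ U.conj().T

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# (1.27): Theta|+> = i|->, Theta|-> = -i|+>.
print(np.allclose(theta(up), 1j * down), np.allclose(theta(down), -1j * up))

# (1.28): Theta^2 = -1.
print(np.allclose(theta(theta(up)), -up), np.allclose(theta(theta(down)), -down))

# (1.26): each component of S = sigma / 2 is odd under time reversal.
spin_is_odd = all(
    np.allclose(conjugate_by_theta(s / 2), -s / 2) for s in (sx, sy, sz)
)
print(spin_is_odd)
```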


Time Reversal for General Spin 


We can generalise this discussion to a particle carrying general spin $s$. (The
formulae below also work for any angular momentum.) The Hilbert space now has
dimension $2s + 1$, and is spanned by the eigenstates of $S_z$

$$S_z|m\rangle = m\hbar\,|m\rangle \qquad m = -s, \ldots, s$$

We again require that the spin operators transform as (1.26) under time reversal. We
can rewrite this requirement as $\Theta\,\mathbf{S} = -\mathbf{S}\,\Theta$. When applied to the eigenstates of $S_z$
this tells us

$$S_z\Theta|m\rangle = -\Theta S_z|m\rangle = -m\hbar\,\Theta|m\rangle$$

which is the statement that $\Theta|m\rangle$ is an eigenstate of $S_z$ with eigenvalue $-m\hbar$. But the
eigenstates of $S_z$ are non-degenerate, so we must have

$$\Theta|m\rangle = \alpha_m|-m\rangle$$

for some choice of phase $\alpha_m$ which, as the notation shows, can depend on $m$.


There’s a clever trick for figuring out how $\alpha_m$ depends on $m$. Consider the raising
and lowering spin operators $S_\pm = S_x \pm iS_y$. The action of time reversal is

$$\Theta S_\pm\Theta^\dagger = \Theta(S_x \pm iS_y)\Theta^\dagger = -S_x \pm iS_y = -S_\mp \qquad (1.29)$$

Now consider the action of $S_+$ on $\Theta|m\rangle$,

$$S_+\Theta|m\rangle = \alpha_m S_+|-m\rangle = \alpha_m\hbar\sqrt{(s+m)(s-m+1)}\ |-m+1\rangle$$

Alternatively, we can use (1.29) to write

$$\begin{aligned}
S_+\Theta|m\rangle = -\Theta S_-|m\rangle &= -\hbar\sqrt{(s+m)(s-m+1)}\ \Theta|m-1\rangle \\
&= -\alpha_{m-1}\hbar\sqrt{(s+m)(s-m+1)}\ |-m+1\rangle
\end{aligned}$$

We learn that

$$\alpha_m = -\alpha_{m-1}$$

The simplest choice is $\alpha_m = (-1)^m$. Because $m$ can be either integer or half-integer,
we will write this as

$$\Theta|m\rangle = i^{2m}|-m\rangle$$

This agrees with our earlier results, (1.25) for orbital angular momentum and (1.27)
for spin-$\frac{1}{2}$. For now, the most important lesson to take from this is

$$\Theta^2 = 1 \qquad \text{integer spin}$$
$$\Theta^2 = -1 \qquad \text{half-integer spin}$$


This result is quite deep. Ultimately it is associated to the fact that spin-half particles
transform in the double cover of the rotation group, so that states pick up a minus sign
when rotated by $2\pi$. As we now show, it has consequences.
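The general-spin formulae can be checked by brute force for the first few values of $s$. The sketch below is only an illustration under stated assumptions: it uses NumPy, units with $\hbar = 1$, a basis ordered as $m = s, s-1, \ldots, -s$, and the representation of time reversal as complex conjugation followed by the unitary matrix that sends $|m\rangle$ to $i^{2m}|-m\rangle$. The function names are hypothetical.

```python
import numpy as np

def spin_matrices(s):
    """Return (m, Sx, Sy, Sz) for spin s in the basis m = s, s-1, ..., -s."""
    dim = int(round(2 * s)) + 1
    m = s - np.arange(dim)
    sz = np.diag(m).astype(complex)
    splus = np.zeros((dim, dim), dtype=complex)
    for col in range(1, dim):
        # S_+|m> = sqrt(s(s+1) - m(m+1)) |m+1>, and |m+1> sits one row above |m>.
        splus[col - 1, col] = np.sqrt(s * (s + 1) - m[col] * (m[col] + 1))
    sx = (splus + splus.conj().T) / 2
    sy = (splus - splus.conj().T) / 2j
    return m, sx, sy, sz

def time_reversal_unitary(s):
    """Unitary part U of Theta = U K, with Theta|m> = i^(2m) |-m>."""
    m, *_ = spin_matrices(s)
    dim = len(m)
    U = np.zeros((dim, dim), dtype=complex)
    for col in range(dim):
        row = dim - 1 - col                      # position of |-m> in the basis
        U[row, col] = 1j ** int(round(2 * m[col]))
    return U

def check_time_reversal(s):
    """Return (is S odd under Theta?, matrix of Theta^2)."""
    _, sx, sy, sz = spin_matrices(s)
    U = time_reversal_unitary(s)
    spin_is_odd = all(
        np.allclose(U @ np.conj(S) @ U.conj().T, -S) for S in (sx, sy, sz)
    )
    theta_squared = U @ np.conj(U)               # (U K)(U K) = U U^*
    return spin_is_odd, theta_squared

for s in (0.5, 1, 1.5, 2):
    odd, theta_sq = check_time_reversal(s)
    print(s, odd, np.round(theta_sq[0, 0].real))
```

For each spin the components of $\mathbf{S}$ come out odd, and the square of the time reversal operator is $+1$ for integer $s$ and $-1$ for half-integer $s$.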


1.2.4 Kramers Degeneracy 


It is not surprising that acting with time reversal twice brings us back to the same
state. It is, however, surprising that sometimes we can return with a minus sign. As
we have seen, this doesn’t happen for spinless particles, nor for particles with integer
spin: in both of these situations we have $\Theta^2 = 1$. However, when dealing with particles
with half-integer spin, we instead have $\Theta^2 = -1$.

Time reversal with $\Theta^2 = 1$ does not automatically lead to any further degeneracy of
the spectrum. (We will, however, see a special case when we discuss the Stark effect in
Section 4.1 where a degeneracy does arise.) In contrast, when $\Theta^2 = -1$, there is always
a degeneracy.

To see this degeneracy, we argue by contradiction. Since $[\Theta, H] = 0$, the state $\Theta|\psi\rangle$
has the same energy as the energy eigenstate $|\psi\rangle$. Suppose that the spectrum is
non-degenerate, so that the two states must be proportional,

$$\Theta|\psi\rangle = \alpha|\psi\rangle$$

for some phase $\alpha$. Then acting twice, we have

$$\Theta^2|\psi\rangle = \alpha^*\Theta|\psi\rangle = |\alpha|^2|\psi\rangle = |\psi\rangle$$

This means that a non-degenerate spectrum can only arise when $\Theta^2 = +1$.


In contrast, whenever we have a time-reversal invariant system with $\Theta^2 = -1$, all energy
eigenstates must come in degenerate pairs. This is known as Kramers degeneracy.
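Kramers degeneracy shows up immediately if we diagonalise a random Hamiltonian that commutes with a time reversal operator squaring to $-1$. The sketch below is an illustration under assumptions of our own choosing: it uses NumPy, takes a spin-$\frac{1}{2}$ particle with `n` hypothetical orbital states, represents time reversal as complex conjugation followed by $\sigma_y \otimes 1$, and builds an invariant Hamiltonian by averaging a random Hermitian matrix with its time-reversed image.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3                                    # orbital states; total dimension 2n

sy = np.array([[0, -1j], [1j, 0]])
U = np.kron(sy, np.eye(n))               # Theta = U K, so Theta^2 = U U^* = -1

# A random Hermitian matrix, with no symmetry at all.
A = rng.normal(size=(2 * n, 2 * n)) + 1j * rng.normal(size=(2 * n, 2 * n))
A = (A + A.conj().T) / 2

# Average A with Theta A Theta^{-1} = U A^* U^dagger to get an invariant H.
H = (A + U @ np.conj(A) @ U.conj().T) / 2

energies = np.linalg.eigvalsh(H)         # sorted in ascending order
pairs_degenerate = np.allclose(energies[0::2], energies[1::2])

generic = np.linalg.eigvalsh(A)
generic_degenerate = np.allclose(generic[0::2], generic[1::2])

print(np.round(energies, 6))
print(pairs_degenerate, generic_degenerate)
```

The eigenvalues of the invariant Hamiltonian come in equal pairs, while those of the unsymmetrised matrix do not.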


For the simple spin system that we described in Section 1.2.3, the degeneracy is
trivial: it is simply the statement that $|+\rangle$ and $|-\rangle$ have the same energy whenever
the Hamiltonian is invariant under time reversal. If we want to split the energy levels,
we need to add a term to the Hamiltonian like $H = \mathbf{B}\cdot\mathbf{S}$ which breaks time reversal.
(Indeed, this ties in nicely with our classical discussion where we saw that the magnetic
field breaks time reversal, changing as $\mathbf{B} \to -\mathbf{B}$.)

In more complicated systems, Kramers degeneracy can be a very powerful statement.
For example, we know that electrons carry spin $\frac{1}{2}$. The degeneracy ensures that in any
time reversal invariant system which involves an odd number of electrons, all energy
levels are doubly degenerate. This simple statement plays an important role in the
subject of topological insulators in condensed matter physics.

2. Approximation Methods


Physicists have a dirty secret: we’re not very good at solving equations. More precisely, 
humans aren’t very good at solving equations. We know this because we have computers 
and they’re much better at solving things than we are. 


We usually do a good job of hiding this secret when teaching physics. In quantum 
physics we start with examples like the harmonic oscillator or the hydrogen atom and 
then proudly demonstrate how clever we all are by solving the Schrödinger equation
exactly. But there are very very few examples where we can write down the solution in 
closed form. For the vast majority of problems, the answer is something complicated 
that isn’t captured by some simple mathematical formula. For these problems we need 
to develop different tools. 


You already met one of these tools in an earlier course: it’s called perturbation theory 
and it’s useful whenever the problem we want to solve is, in some sense, close to one 
that we’ve already solved. This works for a surprisingly large number of problems. 
Indeed, one of the arts of theoretical physics is making everything look like a coupled 
harmonic oscillator so that you can use perturbation theory. But there are also many 
problems for which perturbation theory fails dismally and we need to find another 
approach. In general, there’s no panacea, no universal solution to all problems in 
quantum mechanics. Instead, the best we can hope for is to build a collection of tools. 
Then, whenever we’re faced with a new problem we can root around in our toolbox, 
hoping to find a method that works. The purpose of this chapter is to stock up your 
toolbox. 


2.1 The Variational Method 


The variational method provides a simple way to place an upper bound on the ground
state energy of any quantum system and is particularly useful when trying to demonstrate
that bound states exist. In some cases, it can also be used to estimate higher
energy levels.


2.1.1 An Upper Bound on the Ground State 


We start with a quantum system with Hamiltonian $H$. We will assume that $H$ has a
discrete spectrum

$$H|n\rangle = E_n|n\rangle \qquad n = 0, 1, \ldots$$

with the energy eigenvalues ordered such that $E_n \le E_{n+1}$. The simplest application of
the variational method places an upper bound on the value of the ground state energy
$E_0$.

Theorem: Consider an arbitrary state $|\psi\rangle$. The expected value of the energy obeys
the inequality

$$\langle E\rangle = \langle\psi|H|\psi\rangle \ge E_0$$

Proof: The proposed claim is, hopefully, intuitive and the proof is straightforward.
We expand $|\psi\rangle = \sum_n a_n|n\rangle$ with $\sum_n |a_n|^2 = 1$ to ensure that $\langle\psi|\psi\rangle = 1$. Then

$$\begin{aligned}
\langle E\rangle &= \sum_{n,m=0}^{\infty} a_m^* a_n\,\langle m|H|n\rangle = \sum_{n,m=0}^{\infty} a_m^* a_n\,E_n\,\delta_{mn} \\
&= \sum_{n=0}^{\infty} |a_n|^2 E_n = E_0\sum_{n=0}^{\infty}|a_n|^2 + \sum_{n=0}^{\infty}|a_n|^2\,(E_n - E_0) \ge E_0
\end{aligned}$$

In the case of a non-degenerate ground state, we have equality only if $|a_0| = 1$ which
implies $a_n = 0$ for all $n \ne 0$.


Now consider a family of states, $|\psi(\alpha)\rangle$, depending on some number of parameters
$\alpha_i$. If we like, we can relax our assumption that the states are normalised and define

$$E(\alpha) = \frac{\langle\psi(\alpha)|H|\psi(\alpha)\rangle}{\langle\psi(\alpha)|\psi(\alpha)\rangle}$$

This is sometimes called the Rayleigh-Ritz quotient. We still have

$$E(\alpha) \ge E_0 \quad \text{for all } \alpha$$

The most stringent bound on the ground state energy comes from the minimum value
of $E(\alpha)$ over the range of $\alpha$. This, of course, obeys

$$\left.\frac{\partial E}{\partial\alpha_i}\right|_{\alpha=\alpha_\star} = 0$$

giving us the upper bound $E_0 \le E(\alpha_\star)$. This is the essence of the variational method.
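The inequality is easy to test in a finite-dimensional setting. The following sketch is an illustration only: it assumes NumPy and replaces the Hamiltonian by a random Hermitian matrix, with `rayleigh_quotient` a hypothetical helper implementing the quotient defined above. No randomly chosen trial vector ever dips below the lowest eigenvalue, and the true ground state saturates the bound.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 8

# Random Hermitian matrix standing in for the Hamiltonian.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
H = (M + M.conj().T) / 2

evals, evecs = np.linalg.eigh(H)
E0 = evals[0]

def rayleigh_quotient(H, psi):
    """<psi|H|psi> / <psi|psi> for a trial vector psi (need not be normalised)."""
    return (np.vdot(psi, H @ psi) / np.vdot(psi, psi)).real

trial_energies = np.array([
    rayleigh_quotient(H, rng.normal(size=n) + 1j * rng.normal(size=n))
    for _ in range(1000)
])

print(E0, trial_energies.min())
print(rayleigh_quotient(H, evecs[:, 0]))   # equals E0 up to rounding
```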


The variational method does not tell us how far above the ground state $E(\alpha_\star)$ lies.
It would be much better if we could also get a lower bound for $E_0$ so that we can
say for sure that the ground state energy sits within a particular range. However, for
particles moving in a general potential $V(\mathbf{x})$, the only lower bound that is known is
$E_0 > \min V(\mathbf{x})$. Since we’re often interested in potentials like $V(\mathbf{x}) \sim -1/r$, which
have no lower bound, this is not particularly useful.

Despite these limitations, when used cleverly by choosing a set of states $|\psi(\alpha)\rangle$
which are likely to be fairly close to the ground state, the variational method can
give remarkably accurate results.


An Example: A Quartic Potential 


Consider a particle moving in one-dimension in a quartic potential. The Hamiltonian,
written in units where everything is set to one, is

$$H = -\frac{d^2}{dx^2} + x^4$$

Unlike the harmonic oscillator, this problem does not have a simple solution. Nonetheless,
it is easy to solve numerically where one finds

$$E_0 \approx 1.06$$

[Figure: the quartic potential, with an example ground state wavefunction shown in orange.]

Let’s see how close we get with the variational method. We need to cook up a trial
wavefunction which we think might look something like the true ground state. The
potential is shown in the figure and, on general grounds, the ground state wavefunction
should have support where the potential is smallest; an example is shown in orange.
All we need to do is write down a function which has vaguely this shape. We will take

$$\psi(x) = \left(\frac{\alpha}{\pi}\right)^{1/4} e^{-\alpha x^2/2}$$

where the factor in front ensures that this wavefunction is normalised. You can check
that this isn’t an eigenstate of the Hamiltonian. But it does have the expected crude
features of the ground state: e.g. it goes up in the middle and has no nodes. (Indeed,
it’s actually the ground state of the harmonic oscillator.) The expected energy is

$$E(\alpha) = \sqrt{\frac{\alpha}{\pi}}\int dx\ \left(\alpha - \alpha^2x^2 + x^4\right)e^{-\alpha x^2} = \frac{\alpha}{2} + \frac{3}{4\alpha^2}$$

The minimum value occurs at $\alpha_\star^3 = 3$, giving

$$E(\alpha_\star) \approx 1.08$$


We see that our guess does pretty well, getting within 2% of the true value. You can 
try other trial wavefunctions which have the same basic shape and see how they do. 
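Both numbers quoted above can be reproduced in a few lines. The sketch below is one way of doing so and is not meant as the method behind the quoted numerical value. It assumes NumPy, evaluates the Gaussian-trial energy $E(\alpha) = \alpha/2 + 3/(4\alpha^2)$ at its minimum, and estimates the true ground state energy by diagonalising a second-order finite-difference discretisation of $H = -d^2/dx^2 + x^4$ on a grid. The box size `L` and the number of points `N` are arbitrary choices.

```python
import numpy as np

def variational_energy(alpha):
    """Expected energy of the Gaussian trial state in the quartic potential."""
    return alpha / 2 + 3 / (4 * alpha ** 2)

alpha_star = 3 ** (1 / 3)                  # solves dE/dalpha = 0
E_var = variational_energy(alpha_star)

# Finite-difference Hamiltonian on [-L, L] with the wavefunction set to zero
# outside the box.
L, N = 5.0, 801
x = np.linspace(-L, L, N)
dx = x[1] - x[0]
hopping = -np.ones(N - 1) / dx ** 2
H = np.diag(2 / dx ** 2 + x ** 4) + np.diag(hopping, 1) + np.diag(hopping, -1)
E_num = np.linalg.eigvalsh(H)[0]

print(f"variational bound: {E_var:.4f}")
print(f"finite-difference ground state: {E_num:.4f}")
print(f"relative error: {(E_var - E_num) / E_num:.3%}")
```

The variational value sits above the numerical one, as it must, and the two agree to about 2%.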
How Accurate is the Variational Method?

Formally, we can see why a clever application of the variational method will give a
good estimate of the ground state energy. Suppose that the trial wavefunction which
minimizes the energy differs from the true ground state by

$$|\psi(\alpha_\star)\rangle = \frac{1}{\sqrt{1+\epsilon^2}}\left(|0\rangle + \epsilon|\phi\rangle\right)$$

where $|\phi\rangle$ is a normalised state, orthogonal to the ground state, $\langle 0|\phi\rangle = 0$, and $\epsilon$ is
assumed to be small. Then our guess at the energy is

$$E(\alpha_\star) = \frac{1}{1+\epsilon^2}\Big[\langle 0|H|0\rangle + \epsilon\big(\langle 0|H|\phi\rangle + \langle\phi|H|0\rangle\big) + \epsilon^2\langle\phi|H|\phi\rangle\Big]$$

Importantly the terms linear in $\epsilon$ vanish. This is because $\langle\phi|H|0\rangle = E_0\langle\phi|0\rangle = 0$. We
can then expand the remaining terms as

$$E(\alpha_\star) = E_0 + \epsilon^2\left(\langle\phi|H|\phi\rangle - E_0\right) + \mathcal{O}(\epsilon^4)$$

This means that if the difference from the true ground state is $\mathcal{O}(\epsilon)$, then the difference
from the ground state energy is $\mathcal{O}(\epsilon^2)$. This is the reason that the variational method
often does quite well.
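The quadratic dependence of the energy error on the error in the state can be seen directly. The sketch below is an illustration under stated assumptions: it uses NumPy, a random real symmetric matrix in place of the Hamiltonian, and a randomly chosen normalised state `phi` orthogonal to the ground state. Halving $\epsilon$ should reduce the energy error by a factor close to four.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10

M = rng.normal(size=(n, n))
H = (M + M.T) / 2
evals, evecs = np.linalg.eigh(H)
E0, ground = evals[0], evecs[:, 0]

# A normalised state built from excited eigenvectors only, so <0|phi> = 0.
phi = evecs[:, 1:] @ rng.normal(size=n - 1)
phi /= np.linalg.norm(phi)

def energy_error(eps):
    """E - E0 for the trial state (|0> + eps |phi>) / sqrt(1 + eps^2)."""
    psi = (ground + eps * phi) / np.sqrt(1 + eps ** 2)
    return psi @ H @ psi - E0

ratio = energy_error(0.1) / energy_error(0.05)
print(ratio)   # close to 4, as expected for an error of order eps^2
```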


Nonetheless, one flaw with the variational method is that unless someone tells us
the true answer, we have no way of telling how good our approximation is. Or, in the
language above, we have no way of estimating the size of $\epsilon$. Despite this, we will see
below that there are some useful things we can do with it.


2.1.2 An Example: The Helium Atom 


One important application of quantum mechanics is to explain the structure of atoms. 
Here we will look at two simple approaches to understand an atom with two electrons. 
This atom is helium. 


The Hamiltonian for two electrons, each of charge $-e$, orbiting a nucleus of charge
$Ze$ is

$$H = \frac{\mathbf{p}_1^2}{2m} - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r_1} + \frac{\mathbf{p}_2^2}{2m} - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r_2} + \frac{e^2}{4\pi\epsilon_0}\frac{1}{|\mathbf{x}_1 - \mathbf{x}_2|} \qquad (2.1)$$

For helium, $Z = 2$ but, for reasons that will become clear, we will leave it arbitrary
and only set it to $Z = 2$ at the end of the calculation.

If we ignore the final term, then this Hamiltonian is easy to solve: it simply consists
of two independent copies of the hydrogen atom. The eigenstates would be

$$\Psi(\mathbf{x}_1, \mathbf{x}_2) = \psi_{n_1,l_1,m_1}(\mathbf{x}_1)\,\psi_{n_2,l_2,m_2}(\mathbf{x}_2)$$

where $\psi_{n,l,m}(\mathbf{x})$ are the usual energy eigenstates of the hydrogen atom. We should
remember that the electrons are fermions so we can’t put them in the same state.
However, electrons also have a spin degree of freedom which we have neglected above.
This means that two electrons can have the same spatial wavefunction as long as one
is spin up and the other spin down.


Ignoring the interaction term between electrons gives the energy

$$E = -Z^2\left(\frac{1}{n_1^2} + \frac{1}{n_2^2}\right) Ry \qquad (2.2)$$

where $Ry$ is the Rydberg constant, given by

$$Ry = \frac{me^4}{32\pi^2\epsilon_0^2\hbar^2} \approx 13.6\ \text{eV}$$

Setting $Z = 2$ and $n_1 = n_2 = 1$, this very naive approach suggests that the ground
state of helium has energy $E_0 = -8\,Ry \approx -109$ eV. The true ground state of helium
turns out to have energy

$$E_0 \approx -79.0\ \text{eV} \qquad (2.3)$$

Our task is to find a method to take into account the final, interaction term between
electrons in (2.1) and so get closer to the true result (2.3). Here we try two alternatives.


Perturbation Theory 


Our first approach is to treat the Coulomb energy between two electrons as a pertur- 
bation on the original problem. Before proceeding, there is a question that we should 
always ask in perturbation theory: what is the small, dimensionless parameter that 
ensures that the additional term is smaller than the original terms? 


For us, we need a reason to justify why the last term in the Hamiltonian (2.1) is likely
to be smaller than the other two potential terms. All are due to the Coulomb force, so
come with a factor of $e^2/4\pi\epsilon_0$. But the interactions with the nucleus also come with a
factor of $Z$. This is absent in the electron-electron interaction. This, then, is what we
hang our hopes on: the perturbative expansion will be an expansion in $1/Z$. Of course,
ultimately we will set $1/Z = 1/2$ which is not a terribly small number. This might give
us concern that perturbation theory will not be very accurate for this problem.

We now place each electron in the usual hydrogen ground state $\psi_{1,0,0}(\mathbf{x})$, adapted to
general $Z$

$$\psi_{1,0,0}(\mathbf{x}) = \sqrt{\frac{Z^3}{\pi a_0^3}}\ e^{-Zr/a_0} \qquad (2.4)$$

where $a_0$ is the Bohr radius, defined as

$$a_0 = \frac{4\pi\epsilon_0\hbar^2}{me^2} \approx 5\times 10^{-11}\ \text{m}$$

To leading order, the shift of the ground state energy is given by the standard result
of first order perturbation theory,

$$\Delta E = \frac{e^2}{4\pi\epsilon_0}\int d^3x_1\,d^3x_2\ \frac{|\psi_{1,0,0}(\mathbf{x}_1)|^2\,|\psi_{1,0,0}(\mathbf{x}_2)|^2}{|\mathbf{x}_1 - \mathbf{x}_2|}$$

We need to compute this integral.


The trick is to pick the right coordinate system. We will work in spherical polar
coordinates for both particles. However, we will choose the $z$ axis for the second
particle to lie along the direction $\mathbf{x}_1$ set by the first particle. The advantage of this
choice is that the angle $\theta$ between the two particles coincides with the polar angle
$\theta_2$ for the second particle. In particular, the separation between the two particles can be
written as

$$|\mathbf{x}_1 - \mathbf{x}_2| = \sqrt{(\mathbf{x}_1 - \mathbf{x}_2)^2} = \sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos\theta_2}$$

[Figure 2]

In these coordinates, it is simple to do the integration over the angular variables for
the first particle, and over $\phi_2$ for the second. The shift in the energy then becomes

$$\begin{aligned}
\Delta E &= \frac{8\pi^2e^2}{4\pi\epsilon_0}\left(\frac{Z^3}{\pi a_0^3}\right)^2\int dr_1\ r_1^2\,e^{-2Zr_1/a_0}\int dr_2\ r_2^2\,e^{-2Zr_2/a_0}\int_{-1}^{+1} d(\cos\theta_2)\ \frac{1}{\sqrt{r_1^2 + r_2^2 - 2r_1r_2\cos\theta_2}} \\
&= -\frac{2\pi e^2}{\epsilon_0}\left(\frac{Z^3}{\pi a_0^3}\right)^2\int dr_1\ r_1^2\,e^{-2Zr_1/a_0}\int dr_2\ r_2^2\,e^{-2Zr_2/a_0}\ \frac{\sqrt{(r_1 - r_2)^2} - \sqrt{(r_1 + r_2)^2}}{r_1r_2} \\
&= -\frac{2\pi e^2}{\epsilon_0}\left(\frac{Z^3}{\pi a_0^3}\right)^2\int dr_1\ r_1^2\,e^{-2Zr_1/a_0}\int dr_2\ r_2^2\,e^{-2Zr_2/a_0}\ \frac{|r_1 - r_2| - |r_1 + r_2|}{r_1r_2}
\end{aligned}$$

Those modulus signs are a little odd, but easily dealt with. Because the integral is
symmetric in $r_1$ and $r_2$, the regime $r_1 > r_2$ must give the same result as the regime
$r_1 < r_2$. We can then focus on one of these regimes (say $r_1 > r_2$, where
$|r_1 - r_2| - |r_1 + r_2| = -2r_2$) and just double our result. We have


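$$\begin{aligned}
\Delta E &= \frac{8\pi e^2}{\epsilon_0}\left(\frac{Z^3}{\pi a_0^3}\right)^2\int_0^\infty dr_2\ r_2^2\,e^{-2Zr_2/a_0}\int_{r_2}^\infty dr_1\ r_1\,e^{-2Zr_1/a_0} \\
&= \frac{8\pi e^2}{\epsilon_0}\left(\frac{Z^3}{\pi a_0^3}\right)^2\int_0^\infty dr_2\ r_2^2\left(\frac{a_0r_2}{2Z} + \frac{a_0^2}{4Z^2}\right)e^{-4Zr_2/a_0} \\
&= \frac{5}{8}\,\frac{Ze^2}{4\pi\epsilon_0a_0} = \frac{5Z}{4}\,Ry
\end{aligned}$$

Using first order perturbation theory, we find that the ground state energy of helium is

$$E_0 \approx E + \Delta E = \left(-2Z^2 + \frac{5Z}{4}\right) Ry \approx -74.8\ \text{eV}$$

This is much closer to the correct value of $E_0 \approx -79$ eV. In fact, given that our
perturbative expansion parameter is $1/Z = 1/2$, it’s much better than we might have
anticipated.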
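The value $\Delta E = \frac{5Z}{4}Ry$ can be cross-checked by doing the remaining two radial integrals numerically. The sketch below is a rough check under assumptions made here: it uses NumPy, works in units where $a_0 = 1$ so that $e^2/4\pi\epsilon_0 a_0 = 2\,Ry$, starts from the expression above after the angular integrals have been done, and approximates the double integral with a simple midpoint rule. The grid size and cut-off are arbitrary choices.

```python
import numpy as np

def coulomb_shift_in_rydberg(Z, n_grid=1500, r_max_scaled=15.0):
    """First-order shift Delta E / Ry from a midpoint-rule double integral."""
    dr = r_max_scaled / Z / n_grid
    r = (np.arange(n_grid) + 0.5) * dr            # midpoints, avoids r = 0
    r1, r2 = np.meshgrid(r, r, indexing="ij")

    # Result of the cos(theta_2) integral: (|r1 + r2| - |r1 - r2|) / (r1 r2).
    angular = (r1 + r2 - np.abs(r1 - r2)) / (r1 * r2)
    integrand = r1 ** 2 * r2 ** 2 * np.exp(-2 * Z * (r1 + r2)) * angular

    # Prefactor 8 pi^2 (Z^3 / pi)^2 = 8 Z^6 from the other angles and |psi|^2.
    integral = 8 * Z ** 6 * integrand.sum() * dr * dr
    return 2 * integral                           # e^2 / (4 pi eps0 a0) = 2 Ry

shift = coulomb_shift_in_rydberg(2)
print(shift, 5 * 2 / 4)                           # numerical vs 5Z/4
print((-2 * 2 ** 2 + shift) * 13.6)               # ground state estimate in eV
```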


The Variational Method 


We’ll now try again, this time using the variational method. For our trial wavefunction
we pick $\Psi(\mathbf{x}_1, \mathbf{x}_2) = \psi(\mathbf{x}_1)\psi(\mathbf{x}_2)$ where

$$\psi(\mathbf{x}; \alpha) = \sqrt{\frac{\alpha^3}{\pi a_0^3}}\ e^{-\alpha r/a_0} \qquad (2.5)$$

This is almost the same as the hydrogen ground state (2.4) that we worked with above.
The only difference is that we’ve replaced the atomic number $Z$ with a general parameter
$\alpha$ that we will allow to vary. We can tell immediately that this approach must do
at least as well at estimating the ground state energy because setting $\alpha = Z$ reproduces
the results of first order perturbation theory.


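The expectation of the energy using our trial wavefunction is

$$E(\alpha) = \int d^3x_1\,d^3x_2\ \psi^*(\mathbf{x}_1)\psi^*(\mathbf{x}_2)\,H\,\psi(\mathbf{x}_1)\psi(\mathbf{x}_2)$$

with $H$ the differential operator given in (2.1). Now we have to evaluate all terms in
the Hamiltonian afresh. However, there is a trick we can use. We know that (2.5) is the
ground state of the Hamiltonian

$$H_\alpha = \frac{\mathbf{p}^2}{2m} - \frac{\alpha e^2}{4\pi\epsilon_0}\frac{1}{r}$$

where we’ve replaced $Z$ by $\alpha$ in the second term. With this observation, we write the
helium Hamiltonian (2.1) as

$$H = H_\alpha(\mathbf{p}_1, r_1) + H_\alpha(\mathbf{p}_2, r_2) + \frac{e^2}{4\pi\epsilon_0}\left[(\alpha - Z)\left(\frac{1}{r_1} + \frac{1}{r_2}\right) + \frac{1}{|\mathbf{x}_1 - \mathbf{x}_2|}\right]$$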


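Written in this way, the expected energy becomes

$$E(\alpha) = -2\alpha^2\,Ry + \frac{e^2}{4\pi\epsilon_0}\left[2(\alpha - Z)\int d^3x\ \frac{|\psi(\mathbf{x})|^2}{r} + \int d^3x_1\,d^3x_2\ \frac{|\psi(\mathbf{x}_1)|^2\,|\psi(\mathbf{x}_2)|^2}{|\mathbf{x}_1 - \mathbf{x}_2|}\right]$$

Here, the first term comes from the fact that our trial wavefunction is the ground state
of $H_\alpha$ with ground state energy given by (2.2) with $Z$ replaced by $\alpha$. We still need to
compute the integrals in the second and third term. But both of these are straightforward.
The first is

$$\int d^3x\ \frac{|\psi(\mathbf{x})|^2}{r} = 4\pi\,\frac{\alpha^3}{\pi a_0^3}\int dr\ r\,e^{-2\alpha r/a_0} = \frac{\alpha}{a_0}$$

Meanwhile, the final integral is the same as we computed in our perturbative calculation.
It is

$$\int d^3x_1\,d^3x_2\ \frac{|\psi(\mathbf{x}_1)|^2\,|\psi(\mathbf{x}_2)|^2}{|\mathbf{x}_1 - \mathbf{x}_2|} = \frac{5\alpha}{8a_0}$$

Putting this together, we have

$$E(\alpha) = \left(-2\alpha^2 + 4(\alpha - Z)\alpha + \frac{5}{4}\alpha\right) Ry$$

This is minimized for $\alpha_\star = Z - 5/16$. The minimum value of the energy is then

$$E(\alpha_\star) = -2\left(Z - \frac{5}{16}\right)^2 Ry \approx -77.5\ \text{eV} \qquad (2.6)$$

We see that this is somewhat closer to the true value of $E_0 \approx -79.0$ eV.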
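For readers who like to see the numbers, the minimisation can be repeated in a few lines. The sketch below is an illustration that assumes NumPy and the rounded value $Ry \approx 13.6$ eV quoted earlier; the function name `helium_trial_energy` and the grid of $\alpha$ values are choices made here. It evaluates $E(\alpha)$ on a grid, compares the location of the minimum with $Z - 5/16$, and prints both the perturbative estimate ($\alpha = Z$) and the variational one.

```python
import numpy as np

RYDBERG_EV = 13.6   # rounded value used in the text

def helium_trial_energy(alpha, Z=2):
    """E(alpha) in units of Ry for the product trial wavefunction (2.5)."""
    return -2 * alpha ** 2 + 4 * (alpha - Z) * alpha + 5 * alpha / 4

alphas = np.linspace(1.0, 2.5, 150001)
alpha_numeric = alphas[np.argmin(helium_trial_energy(alphas))]
alpha_exact = 2 - 5 / 16

E_perturbative_eV = helium_trial_energy(2.0) * RYDBERG_EV
E_variational_eV = helium_trial_energy(alpha_exact) * RYDBERG_EV

print(alpha_numeric, alpha_exact)
print(f"perturbation theory: {E_perturbative_eV:.1f} eV")
print(f"variational method:  {E_variational_eV:.1f} eV")
```

Setting $\alpha = Z$ reproduces the perturbative answer, which is why the variational estimate can only be lower.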


There’s one last bit of physics hidden in this calculation. The optimum trial wave- 
function that we ended up using was that of an electron orbiting a nucleus with charge 
(Z — 5/16)e, rather than charge Ze. This has a nice interpretation: the charge of the 
nucleus is screened by the presence of the other electron. 


2.1.3 Do Bound States Exist? 


There is one kind of question where variational methods can give a definitive answer.
This is the question of the existence of bound states.

Consider a particle moving in a localised potential $V(\mathbf{x})$, such that $V(\mathbf{x}) \to 0$ as
$|\mathbf{x}| \to \infty$. A bound state is an energy eigenstate with $E < 0$. For some potentials,
there exist an infinite number of bound states; the Coulomb potential $V = -1/r$ in
three dimensions is a familiar example. For other potentials there will be only a finite
number. And for some potentials there will be none. How can we tell what properties
a given potential has?

Clearly the variational method can be used to prove the existence of a bound state.
All we need to do is exhibit a trial wavefunction which has $E < 0$. This then ensures
that the true ground state also has $E_0 < 0$.

An Example: The Hydrogen Anion

A hydrogen anion $H^-$ consists of a single proton, with two electrons in its orbit. But
does a bound state of two electrons and a proton exist?

The Hamiltonian for $H^-$ is the same as that for helium, (2.1), but now with $Z = 1$.
This means that we can import all the calculations of the previous section. In particular,
our variational method gives a minimum energy (2.6) which is negative when we set
$Z = 1$. This tells us that a bound state of two electrons and a proton does indeed exist.


An Example: The Yukawa Potential 


The Yukawa potential in three-dimensions takes the form

$$V(r) = -A\,\frac{e^{-\lambda r}}{r} \qquad (2.7)$$

For $A > 0$, this is an attractive potential. Note that if we set $\lambda = 0$, this coincides with
the Coulomb force. However, for $\lambda \ne 0$ the Yukawa force drops off much more quickly.

The Yukawa potential arises in a number of different places in physics. Here are two
examples:

• In a metal, electric charge is screened. This was described in Section 7.7 of the
lecture notes on Electromagnetism. This causes the Coulomb potential to be
replaced by the Yukawa potential.

• The strong nuclear force between a proton and a neutron is complicated. However,
at suitably large distances it is well approximated by the Yukawa potential, with
$r$ the relative separation of the proton and neutron. Indeed, this is the context in
which Yukawa first suggested his potential. Thus the question of whether (2.7)
admits a bound state is the question of whether a proton and neutron can bind
together.

A spoiler: the hydrogen atom has a stable isotope known as deuterium. Its nucleus,
known as the deuteron, consists of a proton and neutron. Thus, experiment
tells us that a bound state must exist. We’d like to understand this theoretically,
if only to be sure that the experiments aren’t wrong!


The Hamiltonian is

$$H = -\frac{\hbar^2}{2m}\nabla^2 + V(r)$$

In the context of deuterium, $r$ is the distance between the proton and neutron so $m$
should really be interpreted as the reduced mass $m = m_pm_n/(m_p + m_n) \approx m_p/2$. We
will work with a familiar trial wavefunction,

$$\psi(\mathbf{x}) = \sqrt{\frac{\alpha^3}{\pi}}\ e^{-\alpha r}$$

This is the ground state of the hydrogen atom. The factor in front ensures that the
wavefunction is normalised: $\int d^3x\ |\psi|^2 = 1$. A short calculation shows that the expected
energy is

$$E(\alpha) = \frac{\hbar^2\alpha^2}{2m} - \frac{4A\alpha^3}{(\lambda + 2\alpha)^2}$$

It’s easy to check that there is a value of $\alpha$ for which $E(\alpha) < 0$ whenever

$$\lambda < \frac{Am}{\hbar^2}$$

This guarantees that the Yukawa potential has a bound state when the parameters lie
within this regime. We cannot, however, infer the converse: this method doesn’t tell
us whether there is a bound state when $\lambda > Am/\hbar^2$.
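The threshold $\lambda < Am/\hbar^2$ can be seen by scanning over the variational parameter. The sketch below is an illustration that assumes NumPy and units in which $\hbar = m = A = 1$, so that the threshold sits at $\lambda = 1$; the grid of $\alpha$ values and the helper names are arbitrary choices made here. For $\lambda$ just below one the best trial energy is negative, while for $\lambda$ just above one no value of $\alpha$ gives a negative energy.

```python
import numpy as np

def yukawa_trial_energy(alpha, lam):
    """E(alpha) for the hydrogen-like trial state, in units hbar = m = A = 1."""
    return alpha ** 2 / 2 - 4 * alpha ** 3 / (lam + 2 * alpha) ** 2

alphas = np.linspace(1e-3, 5.0, 50000)

def best_bound(lam):
    """Smallest trial energy found on the grid of alpha values."""
    return yukawa_trial_energy(alphas, lam).min()

for lam in (0.0, 0.5, 0.9, 1.1):
    print(lam, best_bound(lam))
```

At $\lambda = 0$ the scan recovers the hydrogen ground state energy $-\frac{1}{2}$ in these units, which is a useful sanity check on the formula.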


It turns out that for $\lambda$ suitably large, bound states do cease to exist. The simple
variational method above gets this qualitative bit of physics right, but it does not do
so well in estimating the bound. Numerical results tell us that there should be a bound
state whenever $\lambda \lesssim 2.4\,Am/\hbar^2$.


Bound States and The Virial Theorem 


There is a connection between these ideas and the virial theorem. Let’s first remind 
ourselves what the virial theorem is this context. Suppose that we have a particle in d 
dimensions, moving in the potential 


V(x) = Ar” (2.8) 


This means that the potential scales as V(Ax) = A"V (x). We will assume that there 
is a normalised ground state with wavefunction W(x). 
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The ground state energy is 
a, We 2 2— 
Ey = f dx zp Vr + V lolx) = Tho + (V)o 


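Now consider the trial wavefunction $\psi(\mathbf{x}) = \alpha^{d/2}\psi_0(\alpha\mathbf{x})$, where the prefactor ensures
that $\psi(\mathbf{x})$ continues to be normalised. From the scaling property of the potential (2.8),
it is simple to show that

$$E(\alpha) = \alpha^2\langle T\rangle_0 + \alpha^{-n}\langle V\rangle_0$$

The minimum of $E(\alpha)$ is at

$$\frac{dE}{d\alpha} = 2\alpha\langle T\rangle_0 - n\alpha^{-n-1}\langle V\rangle_0 = 0$$

But this minimum must sit at $\alpha = 1$ since, by construction, this is the true ground
state. We learn that for the homogeneous potentials (2.8), we have

$$2\langle T\rangle_0 = n\langle V\rangle_0 \tag{2.9}$$

This is the virial theorem.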
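As a quick sanity check of the scaling argument, the sketch below evaluates $E(\alpha) = \alpha^2\langle T\rangle_0 + \alpha^{-n}\langle V\rangle_0$ on a grid for two familiar ground states and locates the minimum. The input expectation values are standard results quoted as assumptions here (hydrogen in atomic units, and the harmonic oscillator in units with $\hbar\omega = 1$); both satisfy $2\langle T\rangle_0 = n\langle V\rangle_0$.

```python
def scaled_energy(alpha, kinetic, potential, n):
    """E(alpha) = alpha^2 <T>_0 + alpha^(-n) <V>_0 for a potential V = A r^n."""
    return alpha**2 * kinetic + alpha**(-n) * potential


def argmin_on_grid(kinetic, potential, n, lo=0.2, hi=3.0, steps=28000):
    """Return the grid point alpha in [lo, hi] that minimises E(alpha)."""
    best_alpha = lo
    best_energy = scaled_energy(lo, kinetic, potential, n)
    for k in range(1, steps + 1):
        alpha = lo + (hi - lo) * k / steps
        energy = scaled_energy(alpha, kinetic, potential, n)
        if energy < best_energy:
            best_alpha, best_energy = alpha, energy
    return best_alpha


# Hydrogen ground state in atomic units: <T> = 1/2, <V> = -1, n = -1.
coulomb_min = argmin_on_grid(kinetic=0.5, potential=-1.0, n=-1)
# Oscillator ground state with hbar*omega = 1: <T> = <V> = 1/4, n = 2.
oscillator_min = argmin_on_grid(kinetic=0.25, potential=0.25, n=2)
```

In both cases the minimum sits at $\alpha = 1$ to within the grid spacing, as the argument requires.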
Let’s now apply this to our question of bound states. Here are some examples: 


• $V \sim -1/r$: This is the Coulomb potential. The virial theorem tells us that
$E_0 = \langle T\rangle_0 + \langle V\rangle_0 = -\langle T\rangle_0 < 0$. In other words, we proved what we already
know: the Coulomb potential has bound states.


There’s a subtlety here. Nowhere in our argument of the virial theorem did we 
state that the potential (2.8) has A < 0. Our conclusion above would seem to 
hold for A > 0, yet this is clearly wrong: the repulsive potential V ~ +1/r has 
no bound states. What did we miss? Well, we assumed right at the beginning of 
the argument that the ground state $\psi_0$ was normalisable. For repulsive potentials
like $V \sim 1/r$ this is not true: all states are asymptotically plane waves of the
form $e^{i\mathbf{k}\cdot\mathbf{x}}$. The virial theorem is not valid for repulsive potentials of this kind.

• $V \sim -1/r^3$: Now the virial theorem tells us that $E_0 = \frac{1}{3}\langle T\rangle_0 > 0$. This is
actually a contradiction! In a potential like $V \sim 1/r^3$, any state with $E > 0$ is
non-normalisable since it mixes with the asymptotic plane waves. It must be that
this potential has no localised states.

This result might seem surprising. Any potential $V \sim -r^n$ with $n \leq -3$
descends steeply at the origin and you might think that this makes it efficient
at trapping particles there. The trouble is that it is too efficient. The kinetic
energy of the particle is not sufficient to hold it up at some finite distance, and
the particle falls towards the origin. Such potentials have no bound states.


Bound States in One Dimension 


There is an exact and rather pretty result that holds for particles moving in one dimension.

Consider a particle moving in a potential $V(x)$ such that $V(x) = 0$ for $|x| > L$. However, when
$|x| < L$, the potential can do anything you like: it can be positive or negative, oscillate wildly or
behave very calmly.

Figure 3: Does a bound state exist?


Theorem: A bound state exists whenever $\int dx\, V(x) < 0$. In other words, a bound
state exists whenever the potential is “mostly attractive”.


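Proof: We use the Gaussian variational ansatz

$$\psi(x;\alpha) = \left(\frac{\alpha}{\pi}\right)^{1/4} e^{-\alpha x^2/2}$$

Then we find

$$E(\alpha) = \frac{\hbar^2\alpha}{4m} + \sqrt{\frac{\alpha}{\pi}}\int_{-\infty}^{\infty} dx\ V(x)\,e^{-\alpha x^2}$$

where the $\hbar^2\alpha/4m$ term comes from the kinetic energy. The trick is to look at the
function

$$\frac{E(\alpha)}{\sqrt{\alpha}} = \frac{\hbar^2\sqrt{\alpha}}{4m} + \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} dx\ V(x)\,e^{-\alpha x^2}$$

This is a continuous function of $\alpha$. In the limit $\alpha \to 0$, we have

$$\frac{E(\alpha)}{\sqrt{\alpha}} \to \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty} dx\ V(x)$$

If $\int dx\, V(x) < 0$ then $\lim_{\alpha\to 0} E(\alpha)/\sqrt{\alpha} < 0$ and, by continuity, there must be some
small $\alpha > 0$ for which $E(\alpha) < 0$. This ensures that a bound state exists.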
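To see the argument at work, here is a minimal sketch that evaluates the Gaussian trial energy for a made-up potential with a barrier sitting next to a slightly deeper well, so that $\int dx\, V(x) = -1 < 0$. The potential, the units $\hbar = m = 1$ and the function names are all assumptions chosen for illustration.

```python
import math


def example_potential(x):
    """Hypothetical potential: a well next to a barrier, zero for |x| > 1."""
    if -1.0 < x < 0.0:
        return -3.0   # attractive well
    if 0.0 <= x < 1.0:
        return 2.0    # repulsive barrier
    return 0.0


def gaussian_trial_energy(alpha, potential, half_width=1.0, cells=4000, hbar=1.0, m=1.0):
    """E(alpha) = hbar^2 alpha / 4m + sqrt(alpha/pi) * integral of V(x) exp(-alpha x^2)."""
    dx = 2.0 * half_width / cells
    integral = 0.0
    for k in range(cells):
        x = -half_width + (k + 0.5) * dx   # midpoint rule on [-L, L], where V is non-zero
        integral += potential(x) * math.exp(-alpha * x * x) * dx
    return hbar**2 * alpha / (4.0 * m) + math.sqrt(alpha / math.pi) * integral
```

For a broad Gaussian such as $\alpha = 0.1$ the energy comes out negative, while a narrow one such as $\alpha = 50$ is dominated by the kinetic term and is positive. This is the behaviour the proof relies on: only small $\alpha$ is guaranteed to work.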


Once again, the converse to this statement does not hold. There are potentials with
$\int dx\, V(x) > 0$ which do admit bound states.

You may wonder if we can extend this result to higher dimensions. It turns out that
there is an analogous statement in two dimensions¹. However, in three dimensions or
higher there is no such statement. In that case, if the potential is suitably shallow there
are no bound states.


2.1.4 An Upper Bound on Excited States 


So far, we’ve focussed only on approximating the energy of the ground state. Can we 
also use the variational method to give a bound on the energy of excited states? 


This is rather more tricky. We can make progress if we know the ground state $|0\rangle$
exactly. In this case, we construct a trial wavefunction $|\psi(\alpha)\rangle$ that is orthogonal to the
ground state,

$$\langle\psi(\alpha)|0\rangle = 0 \quad \text{for all } \alpha \tag{2.10}$$

Now we can simply rerun our arguments of Section 2.1.1. The minimum of $E(\alpha) =
\langle\psi(\alpha)|H|\psi(\alpha)\rangle$ provides an upper bound on the energy $E_1$ of the first excited state.

In principle, we could then repeat this argument. Working with a trial wavefunction
that is orthogonal to both $|0\rangle$ and $|1\rangle$ will provide an upper bound on the energy $E_2$ of
the second excited state.


In practice, this approach is not much use. Usually, if we’re working with the varia- 
tional method then it’s because we don’t have an exact expression for the ground state, 
making it difficult to construct a trial wavefunction obeying (2.10). If all we have is 
an approximation to the ground state, this is no good at all in providing a bound for 
excited states. 


There is, however, one situation where we can make progress: this is if our Hamilto- 
nian has some symmetry or, equivalently, some other conserved quantity. If we know 
the quantum number of the ground state under this symmetry then we can guarantee 
(2.10) by constructing our trial wavefunction to have a different quantum number. 


An Example: Parity and the Quartic Potential 


For a simple example of this, let’s return to the quartic potential of Section 2.1.1. The 
Hamiltonian is 
$$H = -\frac{d^2}{dx^2} + x^4$$


¹ More details can be found in the paper by Barry Simon, “The bound state of weakly coupled
Schrödinger operators in one and two dimensions”, Ann. Phys. 97, 2 (1976), which you can download
here.

This Hamiltonian is invariant under parity, mapping $x \to -x$. The true ground state
must be even under parity. We can therefore construct a class of trial wavefunctions
for the first excited state which are odd under parity. An obvious choice is


$$\psi(x;\alpha) = \left(\frac{4\alpha^3}{\pi}\right)^{1/4} x\, e^{-\alpha x^2/2}$$

Churning through some algebra, one finds that the minimum energy using this wavefunction is

$$E(\alpha_\star) \approx 3.85$$

The true value is $E_1 \approx 3.80$.
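The algebra can be cross-checked in a few lines. For the odd trial function $(4\alpha^3/\pi)^{1/4}\,x\,e^{-\alpha x^2/2}$, standard Gaussian integrals give $\langle -d^2/dx^2\rangle = 3\alpha/2$ and $\langle x^4\rangle = 15/4\alpha^2$; these two expectation values are worked out here rather than quoted from the notes. The sketch below minimises their sum with a simple ternary search (the helper name `minimise` is just an illustrative choice).

```python
def odd_trial_energy(alpha):
    """E(alpha) = <-d^2/dx^2> + <x^4> = 3 alpha / 2 + 15 / (4 alpha^2)."""
    return 1.5 * alpha + 15.0 / (4.0 * alpha**2)


def minimise(f, lo, hi, tol=1e-10):
    """Ternary search for the minimum of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


alpha_star = minimise(odd_trial_energy, 0.1, 10.0)
best_energy = odd_trial_energy(alpha_star)
```

The minimum lies at $\alpha_\star^3 = 5$ and gives an energy of $\frac{9}{4}\cdot 5^{1/3}$, which rounds to the value 3.85 quoted above.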


2.2 WKB 


The WKB approximation is a method for solving the one-dimensional Schrödinger
equation. The approximation is valid in situations where the potential changes slowly
compared to the de Broglie wavelength $\lambda = 2\pi\hbar/p$ of the particle. The basic idea is that
the wavefunction will be approximately that of a free particle, but with an amplitude
and phase that vary to compensate the changes in the potential.

The method is named after the physicists Wentzel, Kramers and Brillouin. It is
sometimes called the WKBJ approximation, with Harold Jeffreys’ name tagged on
the end to recognise the fact that he discovered it before any of the other three. The
main applications of the method are in estimating bound state energies and computing
tunnelling rates.


2.2.1 The Semi-Classical Expansion 


Before we jump into the quantum problem, let’s build some classical intuition. Suppose
that a one-dimensional potential $V(x)$ takes the form shown on the left-hand figure
below. A classical particle with energy $E$ will oscillate backwards and forwards, with
momentum given by

$$p(x) \equiv \hbar k(x) \equiv \Big(2m\big(E - V(x)\big)\Big)^{1/2} \tag{2.11}$$

Clearly, the particle only exists in the regions where $E \geq V(x)$. At the points where
$E = V(x)$, it turns around and goes back the other way.




Figure 4: The classical state.
Figure 5: The quantum state.


Now let’s think about a quantum particle. Suppose that the potential varies slowly.
This means that if we zoom into some part of the figure then the potential will be
approximately constant. We may imagine that in this part of the potential, we can
approximate the wavefunction by the plane wave $\psi(x) \sim e^{ip(x)x/\hbar}$. However, the wavefunction
also spreads beyond the region where the classical particle can reach. Here
$E < V(x)$ and so, taken at face value, (2.11) tells us that $p(x)$ becomes purely imaginary.
This means that the ansatz $\psi(x) \sim e^{ip(x)x/\hbar}$ will lead to an exponentially decaying
tail of the wavefunction (at least if we pick the minus sign correctly). But that’s exactly
what we expect the wavefunction to do in this region.


These ideas form the basis of the WKB approximation. Our goal now is to place
them on a more systematic footing. To this end, consider the one-dimensional
time-independent Schrödinger equation

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi$$

It will prove useful to write this as

$$\frac{d^2\psi}{dx^2} + \frac{2m}{\hbar^2}\big(E - V(x)\big)\psi = 0$$

Motivated by our discussion above, we will look for solutions of the form

$$\psi(x) = e^{iW(x)/\hbar}$$
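As an intermediate step, it helps to record the derivatives of this ansatz. The lines below are just the chain rule applied to $\psi = e^{iW/\hbar}$, written out as a worked sketch with the same $W(x)$ as above; nothing new is assumed.

$$\frac{d\psi}{dx} = \frac{i}{\hbar}\frac{dW}{dx}\,\psi, \qquad \frac{d^2\psi}{dx^2} = \left[\frac{i}{\hbar}\frac{d^2W}{dx^2} - \frac{1}{\hbar^2}\left(\frac{dW}{dx}\right)^2\right]\psi$$

Since $2m(E - V(x)) = p(x)^2$ by (2.11), multiplying the Schrödinger equation through by $\hbar^2/\psi$ turns it into a condition on $W$ alone.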


Plugging this ansatz into the Schrödinger equation leaves us with the differential equation

$$i\hbar\frac{d^2W}{dx^2} - \left(\frac{dW}{dx}\right)^2 + p(x)^2 = 0 \tag{2.12}$$

where the classical momentum $p(x)$ defined in (2.11) makes an appearance.

The plane wave solutions arise when $W(x) = \hbar kx$, in which case the second derivative
in (2.12) vanishes. Here we'll look for solutions where this second derivative is merely
small, meaning

$$\hbar\left|\frac{d^2W}{dx^2}\right| \ll \left|\frac{dW}{dx}\right|^2 \tag{2.13}$$

We refer to this as the semi-classical limit.


Roughly speaking, (2.13) can be thought of as the $\hbar \to 0$ limit. Indeed, mathematically,
it makes sense to attempt to solve (2.12) using a power series in $\hbar$. As physicists,
this should make us squirm a little as $\hbar$ is dimensionful, and so can’t be “small”. But
we'll first solve the problem and then get a better understanding of when the solution
is valid. For these purposes, we treat $p(x)$ as the background potential which we will
take to be $\mathcal{O}(\hbar^0)$. We expand our solution as

$$W(x) = W_0(x) + \hbar W_1(x) + \hbar^2 W_2(x) + \ldots$$

Plugging this ansatz into (2.12) gives

$$\Big[-W_0'(x)^2 + p(x)^2\Big] + \hbar\Big[iW_0''(x) - 2W_0'(x)W_1'(x)\Big] + \mathcal{O}(\hbar^2) = 0$$


We see that we can now hope to solve these equations order by order in $\hbar$. The first is
straightforward,

$$W_0'(x) = \pm p(x) \quad\Rightarrow\quad W_0(x) = \pm\int^x dx'\ p(x')$$

This is actually something that arises also in classical mechanics: it is the Hamilton-Jacobi
function. More details can be found in Sections 4.7 and 4.8 of the lecture notes
on Classical Dynamics.

At $\mathcal{O}(\hbar)$, we have

$$W_1'(x) = \frac{i}{2}\frac{W_0''(x)}{W_0'(x)} = \frac{i}{2}\frac{p'(x)}{p(x)} \quad\Rightarrow\quad W_1(x) = \frac{i}{2}\log p(x) + c$$

for some constant $c$. Putting these together gives us the WKB approximation to the
wavefunction,

$$\psi(x) \approx \frac{A}{\sqrt{p(x)}}\exp\left(\pm\frac{i}{\hbar}\int^x dx'\ p(x')\right) \tag{2.14}$$


The probability of finding a particle at $x$ is, of course, $|\psi(x)|^2 \sim 1/p(x)$. This is intuitive:
the probability of finding a particle at some point should be proportional
to how long it spends there which, in turn, is inversely proportional to its momentum.

Validity of WKB

Before moving on, let’s try to get a better feeling for the validity of the WKB approximation.
To leading order, our requirement (2.13) reads

$$\hbar\left|\frac{dp}{dx}\right| \ll |p(x)|^2 \quad\Rightarrow\quad \frac{1}{2\pi}\left|\frac{d\lambda}{dx}\right| \ll 1$$

where $\lambda = 2\pi\hbar/p$ is the de Broglie wavelength. This is the statement that the de Broglie
wavelength of the particle does not change considerably over distances comparable to
its wavelength.

Alternatively, we can phrase this as a condition on the potential. Using (2.11), we
have

$$\lambda(x)\left|\frac{dV}{dx}\right| \ll 4\pi\,\frac{|p(x)|^2}{2m}$$

which roughly says that the change of the potential energy over a de Broglie wavelength
should be much less than the kinetic energy (with the factor of $4\pi$ giving an order of
magnitude in leniency).


The Need for a Matching Condition 


Let’s take a slowly varying potential. We want to find a solution to the Schrödinger
equation with some energy $E$.

The WKB approximation provides a solution in regions where $E \gg V(x)$ and,
correspondingly, $p(x)$ is real. This is the case in the middle of the potential, where
the wavefunction oscillates. The WKB approximation also provides a solution when
$E \ll V(x)$, where $p(x)$ is imaginary. This is the case to the far left and far right, where
the wavefunction suffers either exponential decay or growth

$$\psi(x) \approx \frac{A}{\big(2m(V(x) - E)\big)^{1/4}}\exp\left(\pm\frac{1}{\hbar}\int^x dx'\ \sqrt{2m(V(x') - E)}\right)$$

The choice of $\pm$ is typically fixed by normalisability requirements.


But what happens in the region near $E = V(x)$? Here the WKB approximation is
never valid and the putative wavefunction (2.14) diverges because $p(x) = 0$. What to
do?

The point $x_0$ where $p(x_0) = 0$ is the classical turning point. The key idea that
makes the WKB approximation work is matching. This means that we use the WKB
approximation where it is valid. But in the neighbourhood of any turning point we will
instead find a different solution. This will then be matched onto our WKB solution.

So what is the Schrödinger equation that we want to solve in the vicinity of $x_0$? We
expand the potential energy, keeping only the linear term

$$V(x) \approx E + C(x - x_0) + \ldots$$

The Schrödinger equation is then

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + C(x - x_0)\psi = 0 \tag{2.15}$$

We will solve this Schrödinger equation exactly, and then match this solution to the
WKB wavefunction (2.14) to the left and right.


2.2.2 A Linear Potential and the Airy Function 


The problem of the Schrödinger equation for a linear potential is interesting in its
own right. For example, this describes a particle in a constant gravitational field
with $x$ the distance above the Earth. (In this case, we would place a hard wall,
corresponding to the surface of the Earth, at $x = 0$ by requiring that $\psi(0) = 0$.)
Another example involves quarkonium, a bound state of a heavy quark and anti-quark.
Due to confinement of QCD, these experience a linearly growing potential between
them.

For a linear potential $V(x) = Cx$, with $C$ constant, the Schrödinger equation is

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + Cx\,\psi = E\psi \tag{2.16}$$

Before proceeding, it’s best to rescale our variables to absorb all the factors floating
around. Define the dimensionless position

$$u = \left(\frac{2mC}{\hbar^2}\right)^{1/3}(x - E/C) \tag{2.17}$$

Then the Schrödinger equation (2.16) becomes

$$\frac{d^2\psi}{du^2} - u\psi = 0 \tag{2.18}$$


This is known as the Airy equation. The solution is the Airy function, $\psi(u) = \mathrm{Ai}(u)$,
which is defined by the somewhat strange looking integral

$$\mathrm{Ai}(u) = \frac{1}{\pi}\int_0^\infty dt\ \cos\left(\frac{t^3}{3} + ut\right)$$

Figure 6: The Airy function.

To check this, note that

$$\left(\frac{d^2}{du^2} - u\right)\mathrm{Ai}(u) = -\frac{1}{\pi}\int_0^\infty dt\ (t^2 + u)\cos\left(\frac{t^3}{3} + ut\right) = -\frac{1}{\pi}\int_0^\infty dt\ \frac{d}{dt}\sin\left(\frac{t^3}{3} + ut\right)$$

The lower limit of the integral clearly vanishes. The upper limit is more tricky. Heuristically,
it vanishes as $\sin t^3$ oscillates more and more quickly as $t \to \infty$. More care is
needed to make a rigorous argument.


A plot of the Airy function is shown in Figure 6. It has the nice property that it
oscillates for $u < 0$, but decays exponentially for $u > 0$. Indeed, it can be shown that
the asymptotic behaviour is given by

$$\mathrm{Ai}(u) \sim \frac{1}{2}\left(\frac{1}{\pi\sqrt{u}}\right)^{1/2}\exp\left(-\frac{2}{3}u^{3/2}\right) \qquad u \gg 0 \tag{2.19}$$

and

$$\mathrm{Ai}(u) \sim \left(\frac{1}{\pi\sqrt{-u}}\right)^{1/2}\cos\left(\frac{2}{3}u\sqrt{-u} + \frac{\pi}{4}\right) \qquad u \ll 0 \tag{2.20}$$

This kind of behaviour is what we would expect physically. Tracing through our definitions
above, the region $u < 0$ corresponds to $E > V(x)$ and the wavefunction oscillates.
Meanwhile, $u > 0$ corresponds to $E < V(x)$ and the wavefunction dies quickly.


The Airy equation (2.18) is a second order differential equation and so must have a
second solution. This is known as $\mathrm{Bi}(u)$. It has the property that it diverges as $x \to \infty$,
so does not qualify as a good wavefunction in our problem.

An Aside: Quarkonium


Take a quark and anti-quark and separate them. The quarks generate a field which is 
associated to the strong nuclear force and is sometimes called the chromoelectric field. 
Just like in Maxwell theory, this field gives rise to a force between the two quarks. 


Classically the potential between two quarks scales as $V \sim 1/r$, just like the Coulomb
potential. However, quantum fluctuations of the chromoelectric field dramatically change
this behaviour and the chromoelectric field forms a collimated flux tube linking the quarks.
A numerical simulation of this effect is shown on the right². The upshot of this is that the
potential between two quarks changes from being $V \sim 1/r$ to the form

$$V = Cr \tag{2.21}$$

Figure 7:

This means that, in sharp contrast to other forces, it gets harder and harder to separate
quarks. This behaviour is known as confinement. The coefficient $C$ is referred to as
the string tension.


We won’t explain here why the potential takes the linear form (2.21). (In fact, you
won't find a simple explanation of that anywhere! It’s closely related to the Clay
millennium prize problem on Yang-Mills theory. A large part of the lecture notes on
Gauge Theory is devoted to an intuitive understanding of how confinement comes
about.) Instead we’ll just look at the spectrum of states that arises when two quarks
experience a linear potential. These states are called quarkonium. The Schrödinger
equation is

$$-\frac{\hbar^2}{2m}\left(\frac{1}{r^2}\frac{d}{dr}\left(r^2\frac{d\psi}{dr}\right) - \frac{l(l+1)}{r^2}\psi(r)\right) + Cr\,\psi(r) = E\psi(r)$$

There is an interesting story about how this spectrum depends on the angular momentum
$l$ but, for now, we look at the $l = 0$ sector. Defining $\chi = r\psi$ and the dimensionless
coordinate $u = (2mC/\hbar^2)^{1/3}(r - E/C)$ as in (2.17), we see that this once again reduces
to the Airy equation, with solutions given by $\chi(u) = \mathrm{Ai}(u)$.

So far there is no quantisation of the allowed energy $E$. This comes from the requirement
that $\chi(r = 0) = 0$. In other words,

$$\mathrm{Ai}\left(-\frac{E}{C}\left(\frac{2mC}{\hbar^2}\right)^{1/3}\right) = 0$$


² This is part of a set of animations of QCD, the theory of the strong force. You can see them at
Derek Leinweber’s webpage. They’re pretty!

The zeros of the Airy function $\mathrm{Ai}(y)$ can be computed numerically. The first few occur
at $y = -y_\star$, with

$$y_\star = 2.34,\ 4.09,\ 5.52,\ 6.79,\ 7.94,\ 9.02,\ \ldots$$

The first few energy levels are then $E = (\hbar^2C^2/2m)^{1/3}\,y_\star$.
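These zeros can be estimated without any special-function library by using the oscillatory asymptotic form (2.20). Setting the cosine to zero at $u = -y$ gives $\frac{2}{3}y^{3/2} = (n - \frac{1}{4})\pi$ for $n = 1, 2, \ldots$. The sketch below evaluates this estimate and the corresponding energies; it is an approximation built on the asymptotic formula, not an exact root finder, and the function names are illustrative.

```python
import math


def airy_zero_estimate(n):
    """Estimate y_n from (2/3) y^(3/2) = (n - 1/4) pi, the zero of the cosine in (2.20)."""
    return (3.0 * math.pi * (4 * n - 1) / 8.0) ** (2.0 / 3.0)


def quarkonium_level(n, C=1.0, m=1.0, hbar=1.0):
    """E_n = (hbar^2 C^2 / 2m)^(1/3) * y_n, using the estimated zero."""
    return (hbar**2 * C**2 / (2.0 * m)) ** (1.0 / 3.0) * airy_zero_estimate(n)


quoted_zeros = [2.34, 4.09, 5.52, 6.79, 7.94, 9.02]   # values listed in the text
estimates = [airy_zero_estimate(n) for n in range(1, 7)]
```

The estimates agree with the quoted zeros to within a few hundredths, with the largest discrepancy for the lowest zero, where the asymptotic form is least accurate.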


An Application: Matching the WKB Solution 


For us, the main purpose in introducing the Airy function is to put it to work in the
WKB approximation. The Airy function solves the Schrödinger equation (2.15) in the
vicinity of the turning point $x_0$ where, comparing to (2.16), we see that we should set
$x_0 = E/C$. The asymptotic behaviour (2.19) and (2.20) is exactly what we need to
match onto the WKB solution (2.14).

Let’s see how this works. First consider $u < 0$, corresponding to $x < x_0$. Here
$E > V(x)$ and we have the oscillatory solution. We want to rewrite this in terms of our
original variables. In this region, $V(x) \approx E + C(x - x_0)$, so we can justifiably replace

$$u = \left(\frac{2mC}{\hbar^2}\right)^{1/3}(x - x_0) = -\left(\frac{2m}{\hbar^2C^2}\right)^{1/3}\big(E - V(x)\big) = -\frac{p(x)^2}{(2mC\hbar)^{2/3}}$$

where we’ve used our definition of $p(x)$ given in (2.11). In these variables, the asymptotic
form of the Airy function (2.20) is given by

$$\mathrm{Ai}(x) \sim \left(\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(E - V(x))}}\right)^{1/2}\cos\left(\frac{1}{\hbar}\int_{x_0}^{x} dx'\ \sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right) \tag{2.22}$$

This takes the same oscillatory form as the WKB solution (2.14). The two solutions
can be patched together simply by picking an appropriate normalisation factor and
phase for the WKB solution.


Similarly, in the region $u > 0$, the exponentially decaying form of the Airy function
(2.19) can be written as

$$\mathrm{Ai}(x) \sim \frac{1}{2}\left(\frac{(2mC\hbar)^{1/3}}{\pi\sqrt{2m(V(x) - E)}}\right)^{1/2}\exp\left(-\frac{1}{\hbar}\int_{x_0}^{x} dx'\ \sqrt{2m(V(x') - E)}\right) \tag{2.23}$$

This too has the same form as the exponentially decaying WKB solution (2.14).


This, then, is how we piece together solutions. In regions where $E > V(x)$, the
WKB approximation gives oscillating solutions. In regimes where $E < V(x)$, it gives
exponentially decaying solutions. The Airy function interpolates between these two
regimes. The following examples describe this method in practice.

2.2.3 Bound State Spectrum

As an example of this matching, let’s return to the potential shown on the right. Our goal
is to compute the spectrum of bound states. We first split the potential
into three regions where the WKB approximation can
be trusted:

Region 1: $x \ll a$
Region 2: $a \ll x \ll b$
Region 3: $x \gg b$

Figure 8:


We’ll start in the left-most Region 1. Here the WKB
approximation tells us that the solution dies exponentially as

$$\psi_1(x) \approx \frac{A}{\big(2m(V(x) - E)\big)^{1/4}}\exp\left(-\frac{1}{\hbar}\int_x^a dx'\ \sqrt{2m(V(x') - E)}\right)$$

As we approach $x = a$, the potential takes the linear form $V(x) \approx E + V'(a)(x - a)$
and this coincides with the asymptotic form (2.19) of the Airy function $\mathrm{Ai}(-u)$. We
then follow this Airy function through to Region 2 where the asymptotic form (2.22)
tells us that we have

$$\psi_2(x) \approx \frac{2A}{\big(2m(E - V(x))\big)^{1/4}}\cos\left(\frac{1}{\hbar}\int_a^x dx'\ \sqrt{2m(E - V(x'))} - \frac{\pi}{4}\right) \tag{2.24}$$

Note the minus sign in the phase shift $-\pi/4$. This arises because we’re working with
$\mathrm{Ai}(-u)$. The Airy function takes this form close to $x = a$ where $V(x)$ is linear. But, as
we saw above, we can now extend this solution throughout Region 2 where it coincides
with the WKB approximation.


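We now repeat this procedure to match Regions 2 and 3. When $x > b$, the WKB
approximation tells us that the wavefunction is

$$\psi_3(x) \approx \frac{A'}{\big(2m(V(x) - E)\big)^{1/4}}\exp\left(-\frac{1}{\hbar}\int_b^x dx'\ \sqrt{2m(V(x') - E)}\right)$$

Matching to the Airy function across the turning point $x = b$, we have

$$\psi_2(x) \approx \frac{2A'}{\big(2m(E - V(x))\big)^{1/4}}\cos\left(\frac{1}{\hbar}\int_b^x dx'\ \sqrt{2m(E - V(x'))} + \frac{\pi}{4}\right) \tag{2.25}$$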


We’re left with two expressions (2.24) and (2.25) for the wavefunction in Region 2.
Clearly these must agree. Equating the two tells us that $|A| = |A'|$, but they may differ
by a sign, since this can be compensated by the cos function. Insisting that the two
cos functions agree, up to sign, gives us the condition

$$\frac{1}{\hbar}\int_a^x dx'\ \sqrt{2m(E - V(x'))} - \frac{\pi}{4} = \frac{1}{\hbar}\int_b^x dx'\ \sqrt{2m(E - V(x'))} + \frac{\pi}{4} + n\pi$$

for some integer $n$. Rearranging gives

$$\int_a^b dx'\ \sqrt{2m(E - V(x'))} = \left(n + \frac{1}{2}\right)\hbar\pi \tag{2.26}$$

To complete this expression, we should recall what we mean by $a$ and $b$. For a given
energy $E$, these are the extreme values of the classical trajectory where $p(x) = 0$. In
other words, we can write $a = x_{\rm min}$ and $b = x_{\rm max}$. If we write our final expression in
terms of the momentum $p(x)$, it takes the simple form

$$\int_{x_{\rm min}}^{x_{\rm max}} dx\ p(x) = \left(n + \frac{1}{2}\right)\hbar\pi \tag{2.27}$$


An Example: The Harmonic Oscillator 


To illustrate this, let’s look at an example that we all know and love: the harmonic
oscillator with $V(x) = \frac{1}{2}m\omega^2x^2$. The quantisation condition (2.26) becomes

$$\int_{x_{\rm min}}^{x_{\rm max}} dx\ \sqrt{2m\left(E - \tfrac{1}{2}m\omega^2x^2\right)} = \frac{\pi E}{\omega} = \left(n + \frac{1}{2}\right)\hbar\pi \quad\Rightarrow\quad E = \left(n + \frac{1}{2}\right)\hbar\omega$$

This, of course, is the exact spectrum of the harmonic oscillator. I should confess that
this is something of a fluke. In general, we will not get the exact answer. For most
potentials, the accuracy of the answer improves as $n$ increases. This is because the high
$n$ are high energy states. These have large momentum and, hence, small de Broglie
wavelength, which is where the WKB approximation works best.
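The quantisation condition is also easy to solve numerically, which is how one would use it for a potential without a closed-form integral. The sketch below does this for the harmonic oscillator in units $\hbar = m = \omega = 1$ (an assumption for convenience): it evaluates $\int dx\, p(x)$ between the turning points with a midpoint rule and then bisects on $E$. The function names and the choice of quadrature are illustrative.

```python
import math


def action_integral(energy, potential, x_min, x_max, m=1.0, steps=2000):
    """Midpoint estimate of the integral of sqrt(2m(E - V(x))) from x_min to x_max."""
    centre = 0.5 * (x_min + x_max)
    radius = 0.5 * (x_max - x_min)
    d_theta = math.pi / steps
    total = 0.0
    for k in range(steps):
        # x = centre + radius*sin(theta) smooths the square-root behaviour at the turning points
        theta = -0.5 * math.pi + (k + 0.5) * d_theta
        x = centre + radius * math.sin(theta)
        p_squared = max(2.0 * m * (energy - potential(x)), 0.0)
        total += math.sqrt(p_squared) * radius * math.cos(theta) * d_theta
    return total


def wkb_oscillator_level(n, omega=1.0, m=1.0, hbar=1.0):
    """Solve integral p dx = (n + 1/2) pi hbar for E by bisection, with V = m omega^2 x^2 / 2."""
    def potential(x):
        return 0.5 * m * omega**2 * x * x

    target = (n + 0.5) * math.pi * hbar

    def mismatch(energy):
        turning = math.sqrt(2.0 * energy / (m * omega**2))
        return action_integral(energy, potential, -turning, turning, m) - target

    lo, hi = 1e-9, 1.0
    while mismatch(hi) < 0.0:      # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mismatch(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Calling `wkb_oscillator_level(n)` for the first few `n` reproduces $(n + \frac{1}{2})\hbar\omega$ to numerical precision. For another potential one would replace `potential` and the turning-point formula, which is specific to the oscillator here.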


2.2.4 Bohr-Sommerfeld Quantisation 


The WKB approximation underlies an important piece of history from the pre-Schrödinger
era of quantum mechanics. We can rewrite the quantisation condition (2.27) as

$$\oint dx\ p(x) = \left(n + \frac{1}{2}\right)2\pi\hbar$$

where $\oint$ means that we take a closed path in phase space which, in this one-dimensional
example, is from $x_{\rm min}$ to $x_{\rm max}$ and back again. This gives the extra factor of 2 on the
right-hand side. You may recognise the left-hand side as the adiabatic invariant from
the Classical Dynamics lectures. This is a sensible object to quantise as it doesn’t
change if we slowly vary the parameters of the system.

In the old days of quantum mechanics, Bohr and Sommerfeld introduced an ad-hoc
method of quantisation. They suggested that one should impose the condition

$$\oint dx\ p(x) = 2\pi n\hbar$$

with $n$ an integer. They didn’t include the factor of 1/2. They made this guess because
it turns out to correctly describe the spectrum of the hydrogen atom. This too is
something of a fluke! But it was an important fluke that laid the groundwork for
the full development of quantum mechanics. The WKB approximation provides an
a-posteriori justification of the Bohr-Sommerfeld quantisation rule, laced with some
irony: they guessed the wrong approximate quantisation rule which, for the system
they were interested in, just happened to give the correct answer!

More generally, “Bohr-Sommerfeld quantisation” means packaging up a $2d$-dimensional
phase space of the system into small parcels of volume $(2\pi\hbar)^d$ and assigning a quantum
state to each. It is, at best, a crude approximation to the correct quantisation
treatment.


2.2.5 Tunnelling out of a Trap 


For our final application of the WKB approximation, we look at the problem of tunnelling
out of a trap. This kind of problem was first introduced by Gamow as a model
for alpha decay.

Consider the potential shown in the figure, with functional form

$$V(x) = \begin{cases} -V_0 & x < R \\ +\alpha/x & x > R \end{cases}$$

Figure 9:

We'll think of this as a one-dimensional problem; it is not difficult to generalise to
three dimensions. Here $R$ is to be thought of as the size of the nucleus; $V_0$
models the nuclear binding energy, while outside the
nucleus the particle feels a Coulomb repulsion. If we take the particle to have charge $q$
(for an alpha particle, this is $q = 2e$) and the nucleus that remains to have charge $Ze$,
we should have

$$\alpha = \frac{Zqe}{4\pi\epsilon_0} \tag{2.28}$$

Any state with $E < 0$ is bound and cannot leave the trap. (These are shown in green
in the figure.) But those with $0 < E < \alpha/R$ are bound only classically; quantum
mechanics allows them to tunnel through the barrier and escape to infinity. We would
like to calculate the rate at which this happens.


In the region $x < R$, the wavefunction has the form

$$\psi_{\rm inside}(x) = Ae^{ikx} \quad\text{with}\quad E + V_0 = \frac{\hbar^2k^2}{2m}$$


After tunnelling, the particle emerges at distance $x = x_\star$ defined by $E = \alpha/x_\star$. For
$x > x_\star$, the wavefunction again oscillates, with a form given by the WKB approximation
(2.14). However, the amplitude of this wavefunction differs from the value $A$. The ratio
of these two amplitudes determines the tunnelling rate.


To compute this, we patch the two wavefunctions together using the exponentially
decaying WKB solution in the region $R < x < x_\star$. This gives

$$\psi(x_\star) = \psi(R)\,e^{-S/\hbar}$$

where the exponent is given by the integral

$$S = \int_R^{x_\star} dx'\ \sqrt{2m\left(\frac{\alpha}{x'} - E\right)} \tag{2.29}$$

This integral is particularly simple to compute in the limit $R \to 0$ where it is given by

$$S = \frac{\pi\alpha}{2}\sqrt{\frac{2m}{E}} = \frac{\pi\alpha}{v}$$

where, in the second equality, we’ve set the energy of the particle equal to its classical
kinetic energy: $E = \frac{1}{2}mv^2$.
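The size of the correction from a finite $R$ can be checked directly. The following sketch evaluates the barrier integral $S = \int_R^{x_\star} dx\, \sqrt{2m(\alpha/x - E)}$ numerically and compares it with the $R \to 0$ value $\frac{\pi\alpha}{2}\sqrt{2m/E}$. The units ($m = 1$, arbitrary $\alpha$ and $E$), the substitution used to tame the endpoints, and the function names are assumptions made for illustration.

```python
import math


def barrier_exponent(energy, alpha, radius, m=1.0, steps=20000):
    """S = integral from R to x* = alpha/E of sqrt(2m(alpha/x - E)) dx, via a midpoint rule."""
    x_star = alpha / energy
    # x = x_star * sin(theta)^2 removes the square-root behaviour at both ends
    theta_min = math.asin(math.sqrt(radius / x_star))
    d_theta = (0.5 * math.pi - theta_min) / steps
    total = 0.0
    for k in range(steps):
        theta = theta_min + (k + 0.5) * d_theta
        x = x_star * math.sin(theta) ** 2
        momentum = math.sqrt(max(2.0 * m * (alpha / x - energy), 0.0))
        total += momentum * 2.0 * x_star * math.sin(theta) * math.cos(theta) * d_theta
    return total


def small_radius_limit(energy, alpha, m=1.0):
    """R -> 0 result: S = (pi * alpha / 2) * sqrt(2m / E), i.e. pi * alpha / v for E = m v^2 / 2."""
    return 0.5 * math.pi * alpha * math.sqrt(2.0 * m / energy)
```

With $\alpha = 1$ and $E = \frac{1}{2}$ (so $v = 1$) the limit is $S = \pi$, and `barrier_exponent(0.5, 1.0, 1e-6)` agrees with it to better than one percent. Increasing `radius` shortens the forbidden region and lowers $S$, as expected.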


The transmission probability $T$ is then given by

$$T = \frac{|\psi(x_\star)|^2}{|\psi(R)|^2} = e^{-2S/\hbar} \tag{2.30}$$

This already contains some interesting information. In particular, recalling the definition
of $\alpha$ in (2.28), we see that the larger the charge of the nucleus, the less likely the
decay.

Usually we discuss the decay of atomic nuclei in terms of lifetimes. We can compute
this by adding some simple (semi)-classical ideas to the above analysis. Inside the trap,
the particle is bouncing backwards and forwards with velocity

$$v_0 = \sqrt{\frac{2(E + V_0)}{m}}$$

This means that the particle hits the barrier with frequency $\nu = v_0/R$. The decay rate
is then $\Gamma = \nu e^{-2S/\hbar}$ and the lifetime is

$$\tau = \frac{1}{\Gamma} = \sqrt{\frac{R^2m}{2(E + V_0)}}\ e^{2S/\hbar}$$

We didn’t really treat the dependence on $R$ correctly above. We set $R = 0$ when
evaluating the exponent in (2.29), but retained it in the pre-factor. A better treatment
does not change the qualitative results.


One Last Thing...

It is not difficult to extend this to a general potential $V(x)$ as
shown in the figure. In all cases, the transmission probability
has an exponential fall-off of the form $T \sim e^{-2S/\hbar}$ where $S$
is given by

$$S = \int_{x_0}^{x_1} dx'\ \sqrt{2m(V(x') - E)} \tag{2.31}$$

where the positions $x_0$ and $x_1$ are the classical values where
$V(x) = E$, so that the integral is performed only over the forbidden region of the
potential.

Figure 10:

There is a lovely interpretation of this result that has its heart in the path integral
formulation of quantum mechanics. Consider a classical system with the potential
$-V(x)$ rather than $+V(x)$. In other words, we turn the potential upside down. The
action for such a system is

$$S[x(t)] = \int_{t_0}^{t_1} dt\ \left[\frac{1}{2}m\dot{x}^2 + V(x)\right]$$

In this auxiliary system, there is a classical solution, $x_{\rm cl}(t)$, which bounces between the
two turning points, so $x_{\rm cl}(t_0) = x_0$ and $x_{\rm cl}(t_1) = x_1$. It turns out that the exponent
(2.31) is precisely the value of the action evaluated on this solution

$$S = S[x_{\rm cl}(t)]$$


This result essentially follows from the discussion of Hamilton-Jacobi theory in the
Classical Dynamics lecture notes.

2.3 Changing Hamiltonians, Fast and Slow


You learned in the previous course how to set up perturbation theory when the Hamiltonian
$H(t)$ changes with time. There are, however, two extreme situations where life
is somewhat easier. This is when the changes to the Hamiltonian are either very fast,
or very slow.

2.3.1 The Sudden Approximation

We start with the fast case. We consider the situation where the system starts with
some Hamiltonian $H_0$, but then very quickly changes to another Hamiltonian $H$. This
occurs over a small timescale $\tau$.

Of course “very quickly” is relative. We require that the time $\tau$ is much smaller than
any characteristic time scale of the original system. These time scales are set by the
energy splitting, so we must have

$$\tau \ll \frac{\hbar}{\Delta E}$$

If these conditions are obeyed, the physics is very intuitive. The system originally sits
in some state $|\psi\rangle$. But the change happens so quickly that the state does not have a
chance to respond. After time $\tau$, the system still sits in the same state $|\psi\rangle$. The only
difference is that the time dynamics is now governed by $H$ rather than $H_0$.


An Example: Tritium 


Tritium, $^3$H, is an isotope of hydrogen whose nucleus contains a single proton and two
neutrons. It is unstable with a half-life of around 12 years. It suffers beta decay to
helium, emitting an electron and anti-neutrino in the process

$${}^3\mathrm{H} \to {}^3\mathrm{He}^+ + e^- + \bar{\nu}_e$$

The electron is emitted with a fairly wide range of energies, whose mean is $E \sim 5.6$ keV.
Since the mass of the electron is $mc^2 \approx 511$ keV, the electron departs with a speed
given by $E = \frac{1}{2}mv^2$ (we could use the relativistic formula $E = m\gamma c^2$ but it doesn’t
affect the answer too much). This is $v \approx 0.15c$. The time taken to leave the atom is
then $\tau \approx a_0/v \approx 10^{-18}$ s where $a_0 \approx 5 \times 10^{-11}$ m is the Bohr radius.
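The arithmetic behind these estimates fits in a few lines. The sketch below evaluates $v/c$ and the escape time $\tau = a_0/v$, and also the atomic time scale $\hbar/\Delta E$ that $\tau$ has to be compared against, taking $\Delta E = \frac{3}{4}E_0$ as the gap between the ground state and the first excited state of a hydrogen-like atom. The numerical constants are standard rounded values supplied here as assumptions, and the function name is illustrative.

```python
import math

ELECTRON_REST_ENERGY_KEV = 511.0     # m c^2, as quoted in the text
MEAN_BETA_ENERGY_KEV = 5.6           # mean energy of the emitted electron, as quoted in the text
SPEED_OF_LIGHT = 2.998e8             # m / s
BOHR_RADIUS = 5.29e-11               # m (the text rounds this to 5e-11 m)
HBAR_EV_S = 6.582e-16                # eV s
RYDBERG_EV = 13.6                    # hydrogen ground state binding energy E_0


def sudden_approximation_timescales():
    """Return (v/c, escape time tau in s, atomic time hbar/Delta E in s)."""
    beta = math.sqrt(2.0 * MEAN_BETA_ENERGY_KEV / ELECTRON_REST_ENERGY_KEV)  # from E = m v^2 / 2
    tau = BOHR_RADIUS / (beta * SPEED_OF_LIGHT)
    delta_e = 0.75 * RYDBERG_EV       # gap to the first excited state
    return beta, tau, HBAR_EV_S / delta_e
```

With these inputs the escape time comes out at roughly $1.2 \times 10^{-18}$ s, a factor of around fifty shorter than $\hbar/\Delta E$.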


We'll initially take the electron in the tritium atom to sit in its ground state. The
first excited state has energy difference $\Delta E = \frac{3}{4}E_0 \approx 10$ eV, corresponding to a time
scale $\hbar/\Delta E \approx 6.5 \times 10^{-17}$ s. We therefore find $\tau \ll \hbar/\Delta E$ by almost two orders of
magnitude. This justifies our use of the sudden approximation.

The electron ground state of the tritium atom is the same as that of hydrogen, namely

$$\psi_0 = \sqrt{\frac{Z^3}{\pi a_0^3}}\ e^{-Zr/a_0} \quad\text{with}\quad Z = 1$$

After the beta decay, the electron remains in this same state, but this is no longer an
energy eigenstate. Indeed, the ground state of helium takes the same functional form,
but with $Z = 2$. The probability that the electron sits in the ground state of helium is
given by the overlap

$$P = \left|\int d^3x\ \psi_0^\star(\mathbf{x}; Z = 1)\,\psi_0(\mathbf{x}; Z = 2)\right|^2 = \frac{8^3}{9^3} \approx 0.7$$

We see that 70% of the time the electron remains in the ground state. The rest of the
time it sits in some excited state, and subsequently decays down to the ground state.
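The overlap is a one-dimensional radial integral, so it is simple to confirm numerically. The sketch below uses the hydrogenic ground state $\sqrt{Z^3/\pi a_0^3}\,e^{-Zr/a_0}$ with $a_0 = 1$ and a midpoint rule; the cutoff `r_max` and the number of steps are arbitrary choices made for this illustration.

```python
import math


def hydrogenic_1s(r, Z, a0=1.0):
    """psi_0(r; Z) = sqrt(Z^3 / (pi a0^3)) * exp(-Z r / a0)."""
    return math.sqrt(Z**3 / (math.pi * a0**3)) * math.exp(-Z * r / a0)


def ground_state_probability(Z_before=1, Z_after=2, r_max=40.0, steps=200000):
    """|<psi_0(Z_after)|psi_0(Z_before)>|^2 from a radial midpoint rule."""
    dr = r_max / steps
    overlap = 0.0
    for k in range(steps):
        r = (k + 0.5) * dr
        shell = 4.0 * math.pi * r * r * dr          # volume of a thin spherical shell
        overlap += shell * hydrogenic_1s(r, Z_before) * hydrogenic_1s(r, Z_after)
    return overlap**2
```

The result matches $8^3/9^3 = 512/729$, i.e. a probability of about 70% of ending up in the helium ground state.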


2.3.2 An Example: Quantum Quench of a Harmonic Oscillator 


There are a number of experimental situations where one deliberately makes a rapid
change to the Hamiltonian. This forces the system away from equilibrium, with the
goal of opening a window on interesting dynamics. In this situation, the process of the
sudden change of the Hamiltonian is called a quantum quench.

As usual, the harmonic oscillator provides a particularly simple example. Suppose
that we start with the Hamiltonian

$$H_0 = \frac{p^2}{2m} + \frac{1}{2}m\omega_0^2x^2 = \hbar\omega_0\left(a_0^\dagger a_0 + \frac{1}{2}\right)$$

where

$$a_0 = \frac{1}{\sqrt{2\hbar m\omega_0}}\,(m\omega_0x + ip)$$

Then, on a time scale $\tau \ll 1/\omega_0$, we change the frequency of the oscillator so that the
Hamiltonian becomes

$$H = \frac{p^2}{2m} + \frac{1}{2}m\omega^2x^2 = \hbar\omega\left(a^\dagger a + \frac{1}{2}\right)$$


Clearly the wavefunctions for energy eigenstates are closely related since the change in
frequency can be compensated by rescaling $x$. However, here we would like to answer a
different question: if we originally sit in the ground state of $H_0$, which state of $H$ do
we end up in?


A little bit of algebra shows that we can write the new annihilation operator as

$$a = \frac{1}{\sqrt{2m\hbar\omega}}\,(m\omega x + ip) = \frac{1}{2}\left(\sqrt{\frac{\omega}{\omega_0}} + \sqrt{\frac{\omega_0}{\omega}}\right)a_0 + \frac{1}{2}\left(\sqrt{\frac{\omega}{\omega_0}} - \sqrt{\frac{\omega_0}{\omega}}\right)a_0^\dagger$$


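Let’s denote the ground state of $H_0$ by $|\emptyset\rangle$. It obeys $a_0|\emptyset\rangle = 0$. In terms of our new
creation and annihilation operators, this state satisfies $(\omega + \omega_0)\,a|\emptyset\rangle = (\omega - \omega_0)\,a^\dagger|\emptyset\rangle$.
Expanded in terms of the eigenstates $|n\rangle$, $n = 0,1,\ldots$ of $H$, we find that it involves
the whole slew of parity-even excited states

$$|\emptyset\rangle = \sum_{n=0}^\infty \alpha_{2n}|2n\rangle \quad \text{with} \quad \alpha_{2n+2} = \sqrt{\frac{2n+1}{2n+2}}\left(\frac{\omega - \omega_0}{\omega + \omega_0}\right)\alpha_{2n}$$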
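
To get a feel for how quickly the coefficients $\alpha_{2n}$ fall off, we can iterate the recursion numerically. This is a sketch under stated assumptions: the function name `quench_probabilities` and the truncation `n_max` are choices made here, and the closed form $|\alpha_0|^2 = 2\sqrt{\omega\omega_0}/(\omega+\omega_0)$ used for comparison is obtained by summing the series rather than quoted from the text above.

```python
import math

def quench_probabilities(omega0, omega, n_max=200):
    """Probabilities |alpha_{2n}|^2 of finding the old ground state in |2n> of the new H."""
    ratio = (omega - omega0) / (omega + omega0)
    alphas = [1.0]                      # unnormalised, starting from alpha_0 = 1
    for n in range(n_max):
        alphas.append(math.sqrt((2 * n + 1) / (2 * n + 2)) * ratio * alphas[-1])
    norm = sum(a * a for a in alphas)   # fix alpha_0 by normalising the state
    return [a * a / norm for a in alphas]

probs = quench_probabilities(omega0=1.0, omega=2.0)
p_ground = probs[0]
p_ground_closed = 2 * math.sqrt(1.0 * 2.0) / (1.0 + 2.0)
print(p_ground, p_ground_closed, probs[1], probs[2])
```

For a doubling of the frequency the system stays in the new ground state about 94% of the time. The truncation at `n_max` is harmless here because successive probabilities are suppressed by the factor $((\omega-\omega_0)/(\omega+\omega_0))^2$.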


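We can also address more detailed questions about the dynamics. Suppose that the
quench takes place at time $t = 0$. Working in the Heisenberg picture, we know that

$$\langle\emptyset|x^2(0)|\emptyset\rangle = \frac{\hbar}{2m\omega_0} \quad \text{and} \quad \langle\emptyset|p^2(0)|\emptyset\rangle = \frac{\hbar m\omega_0}{2}$$

The position operator now evolves, governed by the new Hamiltonian $H$,

$$x(t) = x(0)\cos(\omega t) + \frac{p(0)}{m\omega}\sin(\omega t)$$

With a little bit of algebra we find that, for $t_2 > t_1$, the positions are correlated as

$$\langle\emptyset|x(t_2)x(t_1)|\emptyset\rangle = \frac{\hbar}{2m\omega}\left[e^{-i\omega(t_2 - t_1)} + \frac{(\omega^2 - \omega_0^2)\cos(\omega(t_2 + t_1)) + (\omega - \omega_0)^2\cos(\omega(t_2 - t_1))}{2\omega\omega_0}\right]$$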


The first term is the evolution of an energy eigenstate; this is what we would get if no
quench took place. The other terms are due to the quench. The surprise is the existence
of the term that depends on $(t_1 + t_2)$. This is not time translationally invariant, even
though both times are measured after $t = 0$. This means that the state carries a
memory of the traumatic event that happened during the quench.
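
The “little bit of algebra” can be checked numerically. The sketch below is illustrative only: the function names are invented here, we set $m = \hbar = 1$ by default, and we use the standard ground-state value $\langle xp\rangle = i\hbar/2$ (an input not spelled out above). It builds the correlator directly from the Heisenberg-picture solution for $x(t)$ and compares it with the closed form.

```python
import cmath
import math

def correlator_direct(t2, t1, omega0, omega, m=1.0, hbar=1.0):
    """<x(t2) x(t1)> from x(t) = x cos(wt) + p sin(wt)/(m w) and the moments of the H_0 ground state."""
    x2 = hbar / (2 * m * omega0)       # <x^2>
    p2 = hbar * m * omega0 / 2         # <p^2>
    xp = 0.5j * hbar                   # <x p>; <p x> is its complex conjugate
    c1, s1 = math.cos(omega * t1), math.sin(omega * t1)
    c2, s2 = math.cos(omega * t2), math.sin(omega * t2)
    return (x2 * c2 * c1 + p2 * s2 * s1 / (m * omega) ** 2
            + (xp * c2 * s1 + xp.conjugate() * s2 * c1) / (m * omega))

def correlator_closed(t2, t1, omega0, omega, m=1.0, hbar=1.0):
    """Closed form: the eigenstate term plus the two terms generated by the quench."""
    quench = ((omega**2 - omega0**2) * math.cos(omega * (t2 + t1))
              + (omega - omega0) ** 2 * math.cos(omega * (t2 - t1))) / (2 * omega * omega0)
    return hbar / (2 * m * omega) * (cmath.exp(-1j * omega * (t2 - t1)) + quench)

direct = correlator_direct(1.3, 0.4, omega0=1.0, omega=2.5)
closed = correlator_closed(1.3, 0.4, omega0=1.0, omega=2.5)
print(direct, closed)
```

Setting `omega` equal to `omega0` switches off both quench terms and leaves only the phase $e^{-i\omega(t_2 - t_1)}$, which depends on the time difference alone.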


2.3.3 The Adiabatic Approximation 


We now turn to the opposite limit, when the Hamiltonian changes very slowly. Here
“slow” is again relative to the time scale $\hbar/\Delta E$ set by the energy splitting, as we will see below.


Consider a Hamiltonian $H(\lambda)$ which depends on some number of parameters $\lambda^i$. For
simplicity, we will assume that $H$ has a discrete spectrum. We write the energy eigenstates as

$$H|n(\lambda)\rangle = E_n(\lambda)|n(\lambda)\rangle \qquad (2.32)$$

Let’s place ourselves in one of these energy eigenstates. Now vary the parameters $\lambda^i$.
The adiabatic theorem states that if $\lambda^i$ are changed suitably slowly, then the system
will cling to the energy eigenstate $|n(\lambda(t))\rangle$ that we started off in.

To see this, we want to solve the time-dependent Schrödinger equation

$$i\hbar\frac{\partial|\psi(t)\rangle}{\partial t} = H(\lambda(t))|\psi(t)\rangle$$

We expand the solution in a basis of instantaneous energy eigenstates,

$$|\psi(t)\rangle = \sum_m a_m(t)\,e^{i\xi_m(t)}\,|m(\lambda(t))\rangle \qquad (2.33)$$

Here $a_m(t)$ are coefficients that we wish to determine, while $\xi_m(t)$ is the usual energy-dependent phase factor

$$\xi_m(t) = -\frac{1}{\hbar}\int_0^t dt'\ E_m(t')$$

To proceed, we substitute our ansatz (2.33) into the Schrödinger equation to find

$$\sum_m\left[\dot{a}_m\,e^{i\xi_m}|m(\lambda)\rangle + a_m\,e^{i\xi_m}\frac{\partial}{\partial\lambda^i}|m(\lambda)\rangle\,\dot\lambda^i\right] = 0$$


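where we’ve cancelled the two terms which depend on $E_m$. Taking the inner product
with $\langle n(\lambda)|$ gives

$$\begin{aligned}
\dot{a}_n &= -\sum_m a_m\,e^{i(\xi_m - \xi_n)}\langle n(\lambda)|\frac{\partial}{\partial\lambda^i}|m(\lambda)\rangle\,\dot\lambda^i \\
&= -i a_n A_i(\lambda)\,\dot\lambda^i - \sum_{m\neq n} a_m\,e^{i(\xi_m - \xi_n)}\langle n(\lambda)|\frac{\partial}{\partial\lambda^i}|m(\lambda)\rangle\,\dot\lambda^i \qquad (2.34)
\end{aligned}$$

In the second line, we’ve singled out the $m = n$ term and defined

$$A_i(\lambda) = -i\langle n|\frac{\partial}{\partial\lambda^i}|n\rangle \qquad (2.35)$$

This is called the Berry connection. It plays a very important role in many aspects of
theoretical physics, and we’ll see some examples in Section 2.3.4.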


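First, we need to deal with the second term in (2.34). We will argue that this is
small. To see this, we return to our original definition (2.32) and differentiate with
respect to $\lambda^i$,

$$\frac{\partial H}{\partial\lambda^i}|m\rangle + H\frac{\partial}{\partial\lambda^i}|m\rangle = \frac{\partial E_m}{\partial\lambda^i}|m\rangle + E_m\frac{\partial}{\partial\lambda^i}|m\rangle$$

Now take the inner product with $\langle n|$ where $n \neq m$ to find

$$(E_m - E_n)\langle n|\frac{\partial}{\partial\lambda^i}|m\rangle = \langle n|\frac{\partial H}{\partial\lambda^i}|m\rangle$$

This means that the second term in (2.34) is proportional to

$$\langle n|\frac{\partial}{\partial\lambda^i}|m\rangle\,\dot\lambda^i = \langle n|\frac{\partial H}{\partial\lambda^i}|m\rangle\,\frac{\dot\lambda^i}{E_m - E_n} \qquad (2.36)$$
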
The adiabatic theorem holds when the rate of change of parameters $\dot\lambda^i$ is much smaller than
the splitting of energy levels $E_m - E_n$. In this limit, we can ignore this term. From
(2.34), we’re then left with

$$\dot{a}_n = -i a_n A_i\,\dot\lambda^i$$

This is easily solved to give

$$a_n = C_n\exp\left(-i\int_0^t dt'\ A_i(\lambda(t'))\,\dot\lambda^i(t')\right) \qquad (2.37)$$

where $C_n$ are constants.


This is the adiabatic theorem. If we start at time $t = 0$ with $a_m = \delta_{mn}$, so the system
is in a definite energy eigenstate $|n\rangle$, then the system remains in the state $|n(\lambda)\rangle$ as
we vary $\lambda$. This is true as long as $\hbar\dot\lambda^i \ll \Delta E$, so that we can drop the term (2.36).
In particular, this means that when we vary the parameters $\lambda$, we should be careful
to avoid level crossing, where another state becomes degenerate with the $|n(\lambda)\rangle$ that
we’re sitting in. In this case, we will have $E_m = E_n$ for some $|m\rangle$ and all bets are off:
when the states separate again, there’s no simple way to tell which linear combination
of the states we now sit in.


However, level crossings are rare in quantum mechanics. In general, you have to tune
three parameters to specific values in order to get two states to have the same energy.
This follows by thinking about a general Hermitian $2\times 2$ matrix which can be
viewed as the Hamiltonian for the two states of interest. The general Hermitian $2\times 2$
matrix depends on 4 parameters, but its eigenvalues only coincide if it is proportional
to the identity matrix. Explicitly, writing the matrix as $a\,\mathbb{1} + \mathbf{b}\cdot\boldsymbol{\sigma}$, the eigenvalues
are $a \pm |\mathbf{b}|$, which coincide only when $\mathbf{b} = 0$. This means that three of those parameters have to be set to
zero.
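
We can watch the adiabatic theorem at work in the simplest setting with a gap that never closes. The sketch below is an illustration rather than anything taken from the derivation above: it assumes a two-state Hamiltonian $H = -B(\sin\theta\,\sigma_x + \cos\theta\,\sigma_z)$ with $\hbar = B = 1$, rotates $\theta$ from $0$ to $\pi/2$ at a constant rate over a total time $T$, and integrates the Schrödinger equation with a hand-rolled fourth-order Runge-Kutta step. The names `rhs`, `shifted` and `final_fidelity` are hypothetical helpers.

```python
import math

def rhs(psi, theta, B=1.0):
    """d(psi)/dt = -i H psi with H = -B (sin(theta) sigma_x + cos(theta) sigma_z), hbar = 1."""
    c, s = math.cos(theta), math.sin(theta)
    h_psi_0 = -B * (c * psi[0] + s * psi[1])
    h_psi_1 = -B * (s * psi[0] - c * psi[1])
    return (-1j * h_psi_0, -1j * h_psi_1)

def shifted(psi, k, dt, scale):
    """psi + scale * dt * k, componentwise."""
    return tuple(p + scale * dt * d for p, d in zip(psi, k))

def final_fidelity(T, steps=20000):
    """Rotate the field from +z to +x in time T; return the probability of ending in the ground state."""
    dt = T / steps
    psi = (1 + 0j, 0j)                       # ground state at theta = 0 (spin along +z)
    theta = lambda t: 0.5 * math.pi * t / T
    for n in range(steps):
        t = n * dt
        k1 = rhs(psi, theta(t))
        k2 = rhs(shifted(psi, k1, dt, 0.5), theta(t + 0.5 * dt))
        k3 = rhs(shifted(psi, k2, dt, 0.5), theta(t + 0.5 * dt))
        k4 = rhs(shifted(psi, k3, dt, 1.0), theta(t + dt))
        psi = tuple(p + dt * (a + 2 * b + 2 * c + d) / 6
                    for p, a, b, c, d in zip(psi, k1, k2, k3, k4))
    amp = (psi[0] + psi[1]) / math.sqrt(2)   # overlap with the ground state at theta = pi/2 (spin along +x)
    return abs(amp) ** 2

fidelity_slow = final_fidelity(T=50.0)   # T much longer than hbar / (2B)
fidelity_fast = final_fidelity(T=0.1)    # T much shorter: close to the sudden limit
print(fidelity_slow, fidelity_fast)
```

The gap here is $2B$ throughout, so the relevant time scale is $\hbar/2B$. For the slow sweep the state should track the instantaneous ground state almost perfectly, while for the fast sweep the state barely moves and the overlap with the new ground state is close to the sudden-approximation value of one half.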


2.3.4 Berry Phase 


There is a surprise hiding in the details of the adiabatic theorem. As we vary the
parameters $\lambda$, the phase of the state $|n(\lambda)\rangle$ changes but there are two contributions,
rather than one. The first is the usual “$e^{-iEt/\hbar}$” phase that we expect for an energy
eigenstate; this is shown explicitly in our original ansatz (2.33). But there is also a
second contribution to the phase, shown in (2.37).


To highlight the distinction between these two contributions, suppose that we vary
the parameters $\lambda$ but finally put them back to their starting values. This means
that we trace out a closed path $C$ in the space of parameters. The second contribution
(2.37) can now be written as

$$e^{i\gamma} = \exp\left(-i\oint_C A_i(\lambda)\,d\lambda^i\right) \qquad (2.38)$$

In contrast to the energy-dependent phase, this does not depend on the time taken to
make the journey in parameter space. Instead, it depends only on the path we
take through parameter space.


Although the extra contribution (2.38) was correctly included in many calculations 
over the decades, its general status was only appreciated by Michael Berry in 1984. 
It is known as the Berry phase. It plays an important role in many of the more 
subtle applications that are related to topology, such as the quantum Hall effect and 
topological insulators. 


There is some very pretty geometry underlying the Berry phase. We can start to get 
a feel for this by looking a little more closely at the Berry connection (2.35). This is 
an example of a kind of object that you’ve seen before: it is like the gauge potential in 
electromagnetism! Let’s explore this analogy a little further. 


In the relativistic form of electromagnetism, we have a gauge potential $A_\mu(x)$ where
$\mu = 0,1,2,3$ and $x$ are coordinates over Minkowski spacetime. There is a redundancy
in the description of the gauge potential: all physics remains invariant under the gauge
transformation

$$A_\mu \to A'_\mu = A_\mu + \partial_\mu\omega \qquad (2.39)$$

for any function $\omega(x)$. In our course on Electromagnetism, we learned that if we
want to extract the physical information contained in $A_\mu$, we should compute the field
strength

$$F_{\mu\nu} = \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}$$

This contains the electric and magnetic fields. It is invariant under gauge transformations.


Now let’s compare this to the Berry connection $A_i(\lambda)$. Of course, this no longer
depends on the coordinates of Minkowski space; instead it depends on the parameters
$\lambda^i$. The number of these parameters is arbitrary; let’s suppose that we have $d$ of them.
This means that $i = 1,\ldots,d$. In the language of differential geometry $A_i(\lambda)$ is said to
be a one-form over the space of parameters, while $A_\mu(x)$ is said to be a one-form over
Minkowski space.


There is also a redundancy in the information contained in the Berry connection
$A_i(\lambda)$. This follows from the arbitrary choice we made in fixing the phase of the
reference states $|n(\lambda)\rangle$. We could just as happily have chosen a different set of reference
states which differ by a phase. Moreover, we could pick a different phase for every choice
of parameters $\lambda$,

$$|n'(\lambda)\rangle = e^{i\omega(\lambda)}|n(\lambda)\rangle$$

for any function $\omega(\lambda)$. If we compute the Berry connection arising from this new choice,
we have

$$A'_i = -i\langle n'|\frac{\partial}{\partial\lambda^i}|n'\rangle = A_i + \frac{\partial\omega}{\partial\lambda^i} \qquad (2.40)$$

This takes the same form as the gauge transformation (2.39).


Following the analogy with electromagnetism, we might expect that the physical
information in the Berry connection can be found in the gauge invariant field strength
which, mathematically, is known as the curvature of the connection,

$$F_{ij}(\lambda) = \frac{\partial A_j}{\partial\lambda^i} - \frac{\partial A_i}{\partial\lambda^j}$$

It’s certainly true that $F$ contains some physical information about our quantum system,
but it’s not the only gauge invariant quantity of interest. In the present context,
the most natural thing to compute is the Berry phase (2.38). Importantly, this too is
independent of the arbitrariness arising from the gauge transformation (2.40). This is
because $\oint \partial_i\omega\,d\lambda^i = 0$. Indeed, we’ve already seen this same expression in the context
of electromagnetism: it is the Aharonov-Bohm phase that we also met in the lectures
on Solid State Physics.


In fact, it’s possible to write the Berry phase in terms of the field strength using the
higher-dimensional version of Stokes’ theorem

$$e^{i\gamma} = \exp\left(-i\oint_C A_i(\lambda)\,d\lambda^i\right) = \exp\left(-i\int_S F_{ij}\,dS^{ij}\right) \qquad (2.41)$$

where $S$ is a two-dimensional surface in the parameter space bounded by the path $C$.


2.3.5 An Example: A Spin in a Magnetic Field


The standard example of the Berry phase is very simple. It is a spin, with a Hilbert
space consisting of just two states. The spin is placed in a magnetic field $\mathbf{B}$. We met
the Hamiltonian in this system when we discussed particles in a magnetic field in the
lectures on Solid State Physics: it is

$$H = -\mathbf{B}\cdot\boldsymbol{\sigma} + B$$

where $\boldsymbol{\sigma}$ are the triplet of Pauli matrices. We’ve set the magnetic moment of the particle
to unity for convenience, and we’ve also added the constant offset $B = |\mathbf{B}|$ to this
Hamiltonian to ensure that the ground state always has vanishing energy. This is so
that the phase $e^{-iEt/\hbar}$ will be trivial for the ground state and we can focus on the Berry
phase that we care about.


The Hamiltonian has two eigenvalues: 0 and $+2B$. We denote the ground state as
$|\downarrow\rangle$ and the excited state as $|\uparrow\rangle$,

$$H|\downarrow\rangle = 0 \quad \text{and} \quad H|\uparrow\rangle = 2B|\uparrow\rangle$$

Note that these two states are non-degenerate as long as $B \neq 0$.


We are going to treat the magnetic field as the parameters, so that $\lambda^i \equiv B^i$ in this
example. Be warned: this means that things are about to get confusing because we’ll
be talking about Berry connections $A_i$ and curvatures $F_{ij}$ over the space of magnetic
fields. (As opposed to electromagnetism where we talk about magnetic fields over
actual space).


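The specific form of $|\uparrow\rangle$ and $|\downarrow\rangle$ will depend on the orientation of $\mathbf{B}$. To provide
more explicit forms for these states, we write the magnetic field $\mathbf{B}$ in spherical polar
coordinates

$$\mathbf{B} = \begin{pmatrix} B\sin\theta\cos\phi \\ B\sin\theta\sin\phi \\ B\cos\theta \end{pmatrix}$$

with $\theta \in [0,\pi]$ and $\phi \in [0,2\pi)$. The Hamiltonian then reads

$$H = -B\begin{pmatrix} \cos\theta - 1 & e^{-i\phi}\sin\theta \\ e^{+i\phi}\sin\theta & -\cos\theta - 1 \end{pmatrix}$$

In these coordinates, two normalised eigenstates are given by

$$|\downarrow\rangle = \begin{pmatrix} e^{-i\phi}\sin\theta/2 \\ -\cos\theta/2 \end{pmatrix} \quad \text{and} \quad |\uparrow\rangle = \begin{pmatrix} e^{-i\phi}\cos\theta/2 \\ \sin\theta/2 \end{pmatrix}$$
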
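These states play the role of our $|n(\lambda)\rangle$ that we had in our general derivation. Note,
however, that they are not well defined for all values of $\mathbf{B}$. When we have $\theta = \pi$, the
angular coordinate $\phi$ is not well defined. This means that $|\downarrow\rangle$ and $|\uparrow\rangle$ don’t have
well defined phases. This kind of behaviour is typical of systems with non-trivial Berry
phase.


We can easily compute the Berry phase arising from these states (staying away from
$\theta = \pi$ to be on the safe side). We have

$$A_\theta = -i\langle\downarrow|\frac{\partial}{\partial\theta}|\downarrow\rangle = 0 \quad \text{and} \quad A_\phi = -i\langle\downarrow|\frac{\partial}{\partial\phi}|\downarrow\rangle = -\sin^2\left(\frac{\theta}{2}\right)$$

The resulting Berry curvature in polar coordinates is

$$F_{\theta\phi} = \frac{\partial A_\phi}{\partial\theta} - \frac{\partial A_\theta}{\partial\phi} = -\frac{1}{2}\sin\theta$$
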


This is simpler if we translate it back to cartesian coordinates where the rotational
symmetry is more manifest. It becomes

$$F_{ij}(\mathbf{B}) = -\epsilon_{ijk}\frac{B^k}{2|\mathbf{B}|^3}$$

But this is interesting. It is a magnetic monopole. Except now it’s not a magnetic
monopole of electromagnetism. Instead it is, rather confusingly, a magnetic monopole
in the space of magnetic fields.


Note that the magnetic monopole sits at the point $\mathbf{B} = 0$ where the two energy levels
coincide. Here, the field strength is singular. This is the point where we can no longer
trust the Berry phase computation. Nonetheless, it is the presence of this level crossing
and the resulting singularity which is dominating the physics of the Berry phase.


The magnetic monopole has charge $g = -1/2$, meaning that the integral of the Berry
curvature over any two-sphere $S^2$ which surrounds the origin is

$$\int_{S^2} F_{ij}\,dS^{ij} = 4\pi g = -2\pi \qquad (2.42)$$

Figure 11: Integrating over $S$... Figure 12: ...or over $S'$.


Using this, we can easily compute the Berry phase for any path $C$ that we choose to
take in the space of magnetic fields $\mathbf{B}$. We only insist that the path $C$ avoids the origin.
Suppose that the surface $S$, bounded by $C$, makes a solid angle $\Omega$. Then, using the
form (2.41) of the Berry phase, we have

$$e^{i\gamma} = \exp\left(-i\int_S F_{ij}\,dS^{ij}\right) = \exp\left(\frac{i\Omega}{2}\right) \qquad (2.43)$$



Note, however, that there is an ambiguity in this computation. We could choose to
form $S$ as shown in the left-hand figure. But we could equally well choose the surface
$S'$ to go around the back of the sphere, as shown in the right-hand figure. In this case,
the solid angle formed by $S'$ is $\Omega' = 4\pi - \Omega$. Computing the Berry phase using $S'$ gives

$$e^{i\gamma'} = \exp\left(-i\int_{S'} F_{ij}\,dS^{ij}\right) = \exp\left(\frac{-i(4\pi - \Omega)}{2}\right) = e^{i\gamma} \qquad (2.44)$$

where the difference in sign in the second equality comes because the surface now has
opposite orientation. So, happily, the two computations agree. Note, however, that
this agreement requires that the charge of the monopole in (2.42) is $2g \in \mathbf{Z}$.


The discussion above is a repeat of Dirac’s argument for the quantisation of magnetic
charge; this can also be found in the lectures on Solid State Physics and the lectures on
Gauge Theory (where you’ll even find the same figures!). Dirac’s quantisation argument
extends to a general Berry curvature $F_{ij}$ with an arbitrary number of parameters: the
integral of the curvature over any closed surface must be quantised in units of $2\pi$,

$$\int F_{ij}\,dS^{ij} = 2\pi C \qquad (2.45)$$

The integer $C \in \mathbf{Z}$ is called the Chern number.


You can read more about extensions of the Berry phase and its applications in the 
lectures on the Quantum Hall Effect. 
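
The solid-angle formula (2.43) can be checked without ever computing the Berry connection. The sketch below is one standard way to do this, not something taken from the text: we discretise a loop at fixed $\theta$, multiply together the overlaps between neighbouring states, and read off the phase of the product. With our sign conventions the Berry phase is minus that phase. The function names are hypothetical, and the state vector is the one labelled $|\downarrow\rangle$ above.

```python
import cmath
import math

def spin_down(theta, phi):
    """The state labelled |down> in the text, as a two-component vector."""
    return (cmath.exp(-1j * phi) * math.sin(theta / 2), -math.cos(theta / 2))

def berry_phase(theta, n_steps=2000):
    """Berry phase gamma for the loop phi: 0 -> 2 pi at fixed polar angle theta."""
    product = 1 + 0j
    for k in range(n_steps):
        a = spin_down(theta, 2 * math.pi * k / n_steps)
        b = spin_down(theta, 2 * math.pi * (k + 1) / n_steps)
        # <a|b> is approximately exp(i A_phi dphi) for neighbouring points on the loop
        product *= a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    # e^{i gamma} = exp(-i oint A), so gamma is minus the phase of the product
    return -cmath.phase(product)

theta = math.pi / 3
gamma = berry_phase(theta)
solid_angle = 2 * math.pi * (1 - math.cos(theta))   # cap around the north pole bounded by the loop
print(gamma, solid_angle / 2)
```

Because every state appears once as a bra and once as a ket, the product of overlaps is unchanged if we multiply any of the states by an arbitrary phase. This is the discrete version of the statement that the Berry phase is gauge invariant. The comparison should be made modulo $2\pi$, which is exactly the ambiguity between the surfaces $S$ and $S'$.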


2.3.6 The Born-Oppenheimer Approximation

“I couldn’t find any mistake - did you really do this alone?”


Oppenheimer to his research supervisor Max Born


The Born-Oppenheimer approximation is an approach to solving quantum mechanical
problems in which there is a hierarchy of scales. The standard example is a
bunch of nuclei, each with position $\mathbf{R}_\alpha$, mass $M_\alpha$ and charge $Z_\alpha e$, interacting with a
bunch of electrons, each with position $\mathbf{r}_i$, mass $m$ and charge $-e$. The Hamiltonian is

$$H = -\sum_\alpha\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2 - \sum_i\frac{\hbar^2}{2m}\nabla_i^2 + \frac{e^2}{4\pi\epsilon_0}\left(\sum_{i<j}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} + \sum_{\alpha<\beta}\frac{Z_\alpha Z_\beta}{|\mathbf{R}_\alpha - \mathbf{R}_\beta|} - \sum_{i,\alpha}\frac{Z_\alpha}{|\mathbf{r}_i - \mathbf{R}_\alpha|}\right)$$


This simple Hamiltonian is believed to describe much of what we see around us in
the world, so much so that some condensed matter physicists will refer to this, only
half-jokingly, as the “theory of everything”. Of course, the information about any
complex system is deeply hidden within this equation, and the art of physics is finding
approximation schemes, or emergent organising principles, to extract this information.


The hierarchy of scales in the Hamiltonian above arises because of the mass difference
between the nuclei and the electrons. Recall that the proton-to-electron mass ratio is
$m_p/m_e \approx 1836$. This means that the nuclei are cumbersome and slow, while the
electrons are nimble and quick. Relatedly, the nuclei wavefunctions are much more
localised than the electron wavefunctions. This motivates us to first fix the positions
of the nuclei and look at the electron Hamiltonian, and only later solve for the nuclei
dynamics. This is the essence of the Born-Oppenheimer approximation.


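To this end, we write

$$H = H_{\rm nuc} + H_{\rm el}$$

where

$$H_{\rm nuc} = -\sum_\alpha\frac{\hbar^2}{2M_\alpha}\nabla_\alpha^2 + \frac{e^2}{4\pi\epsilon_0}\sum_{\alpha<\beta}\frac{Z_\alpha Z_\beta}{|\mathbf{R}_\alpha - \mathbf{R}_\beta|}$$

and

$$H_{\rm el} = -\sum_i\frac{\hbar^2}{2m}\nabla_i^2 + \frac{e^2}{4\pi\epsilon_0}\left(\sum_{i<j}\frac{1}{|\mathbf{r}_i - \mathbf{r}_j|} - \sum_{i,\alpha}\frac{Z_\alpha}{|\mathbf{r}_i - \mathbf{R}_\alpha|}\right)$$

We then solve for the eigenstates of $H_{\rm el}$, where the nuclei positions $\mathbf{R}$ are viewed as
parameters which, as in the adiabatic approximation, will subsequently vary slowly.
The only difference with our previous discussion is that the time evolution of $\mathbf{R}$ is
determined by the dynamics of the system, rather than under the control of some
experimenter.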


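For fixed $\mathbf{R}$, the instantaneous electron wavefunctions are

$$H_{\rm el}\,\phi_n(\mathbf{r};\mathbf{R}) = \epsilon_n(\mathbf{R})\,\phi_n(\mathbf{r};\mathbf{R})$$

In what follows, we will assume that the energy levels are non-degenerate. (There is
an interesting generalisation if there is a degeneracy which we will not discuss in these
lectures.) We then make the ansatz for the wavefunction of the full system

$$\Psi(\mathbf{r},\mathbf{R}) = \sum_n \Phi_n(\mathbf{R})\,\phi_n(\mathbf{r};\mathbf{R})$$

We’d like to write down an effective Hamiltonian which governs the nuclei wavefunctions
$\Phi_n(\mathbf{R})$. This is straightforward. The wavefunction $\Psi$ obeys

$$(H_{\rm nuc} + H_{\rm el})\Psi = E\Psi$$

Switching to bra-ket notation for the electron eigenstates, we can write this as

$$\sum_n\langle\phi_m|H_{\rm nuc}\Phi_n|\phi_n\rangle + \epsilon_m(\mathbf{R})\Phi_m = E\Phi_m \qquad (2.46)$$

Now $H_{\rm nuc}$ contains the kinetic term $\nabla^2_{\mathbf{R}}$, and this acts both on the nuclei wavefunction
$\Phi_n$, but also on the electron wavefunction $\phi_n(\mathbf{r};\mathbf{R})$ where the nuclei positions sit as
parameters. We have

$$\langle\phi_m|\nabla^2_{\mathbf{R}}\Phi_n|\phi_n\rangle = \sum_k\left(\delta_{mk}\nabla_{\mathbf{R}} + \langle\phi_m|\nabla_{\mathbf{R}}|\phi_k\rangle\right)\left(\delta_{kn}\nabla_{\mathbf{R}} + \langle\phi_k|\nabla_{\mathbf{R}}|\phi_n\rangle\right)\Phi_n$$
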


We now argue that, as in Section 2.3.3, the off-diagonal terms are small. The same
analysis as in (2.36) shows that they can be written as

$$\sum_{k\neq n}\langle\phi_n|\nabla_{\mathbf{R}}|\phi_k\rangle\langle\phi_k|\nabla_{\mathbf{R}}|\phi_n\rangle = -\sum_{k\neq n}\left|\frac{\langle\phi_n|(\nabla_{\mathbf{R}}H_{\rm el})|\phi_k\rangle}{\epsilon_n - \epsilon_k}\right|^2$$

In the spirit of the adiabatic approximation, these can be neglected as long as the
motion of the nuclei is slow compared to the splitting of the electron energy levels. In this
limit, we get a simple effective Hamiltonian for the nuclei (2.46). The Hamiltonian
depends on the state $|\phi_n\rangle$ that the electrons sit in, and is given by

$$H^{\rm eff}_n = -\sum_\alpha\frac{\hbar^2}{2M_\alpha}\left(\nabla_\alpha + iA_{n,\alpha}\right)^2 + \frac{e^2}{4\pi\epsilon_0}\sum_{\alpha<\beta}\frac{Z_\alpha Z_\beta}{|\mathbf{R}_\alpha - \mathbf{R}_\beta|} + \epsilon_n(\mathbf{R})$$

We see that the electron energy level $\epsilon_n(\mathbf{R})$ acts as an effective potential for the nuclei.
Perhaps more surprisingly, the Berry connection

$$A_{n,\alpha} = -i\langle\phi_n|\nabla_{\mathbf{R}_\alpha}|\phi_n\rangle$$

also makes an appearance, now acting as an effective magnetic field in which the nuclei
$\mathbf{R}_\alpha$ move.


The idea of the Born-Oppenheimer approximation is that we can first solve for the
fast-moving degrees of freedom, to find an effective action for the slow-moving degrees
of freedom. We sometimes say that we have “integrated out” the electron degrees of
freedom, language which really comes from the path integral formulation of quantum
mechanics. This is a very powerful idea, and one which becomes increasingly important
as we progress in theoretical physics. Indeed, this simple idea underpins the Wilsonian
renormalisation group which we will meet in later courses.


2.3.7 An Example: Molecules 


The Born-Oppenheimer approximation plays a key role in chemistry (and, therefore,
in life in general). This is because it provides quantitative insight into the formation of
covalent bonds, in which it’s energetically preferable for nuclei to stick together because
the gain in energy from sharing an electron beats their mutual Coulomb repulsion.


The simplest example is the formation of the hydrogen molecular ion $H_2^+$, consisting of
two protons and a single electron. If we fix the proton separation to $\mathbf{R}$, then the
resulting Hamiltonian for the electron is

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0}\left[\frac{1}{r} + \frac{1}{|\mathbf{r} - \mathbf{R}|}\right]$$

To proceed, we will combine the Born-Oppenheimer approximation with the variational
method that we met in Section 2.1. Our ultimate goal is simply to show that a bound
state exists. For this, the effective potential energy is much more important than the
Berry connection. We will consider two possible ansätze for the electron ground state

$$\psi_\pm(\mathbf{r}) = A_\pm\left(\psi_0(\mathbf{r}) \pm \psi_0(\mathbf{r} - \mathbf{R})\right)$$

where

$$\psi_0 = \sqrt{\frac{1}{\pi a_0^3}}\; e^{-r/a_0}$$

is the ground state wavefunction of hydrogen, which has energy $E_0 = -e^2/8\pi\epsilon_0 a_0$.

Figure 13: The potential for $\psi_+$. Figure 14: The potential for $\psi_-$.


Although $\psi_0$ is normalised, the full wavefunction $\psi_\pm$ is not. The normalisation condition
gives

$$A_\pm^2 = \frac{1}{2}\left[1 \pm \int d^3r\ \psi_0(\mathbf{r})\psi_0(\mathbf{r} - \mathbf{R})\right]^{-1}$$

This is the first of several, rather tedious integrals that we have in store. They can all be
done using the kind of techniques that we introduced in Section 2.1.2 when discussing
helium. Here I’ll simply state the answers. It turns out that

$$u(R) = \int d^3r\ \psi_0(\mathbf{r})\psi_0(\mathbf{r} - \mathbf{R}) = \left(1 + \frac{R}{a_0} + \frac{R^2}{3a_0^2}\right)e^{-R/a_0}$$

Moreover, we’ll also need

$$v(R) = \int d^3r\ \frac{\psi_0(\mathbf{r})\psi_0(\mathbf{r} - \mathbf{R})}{r} = \frac{1}{a_0}\left(1 + \frac{R}{a_0}\right)e^{-R/a_0}$$

$$w(R) = \int d^3r\ \frac{\psi_0(\mathbf{r})^2}{|\mathbf{r} - \mathbf{R}|} = \frac{1}{R} - \frac{1}{R}\left(1 + \frac{R}{a_0}\right)e^{-2R/a_0}$$



The expected energy in the state $\psi_\pm(\mathbf{r})$ can be calculated to be

$$\epsilon_\pm(R) = \langle\psi_\pm|H_{\rm el}|\psi_\pm\rangle = E_0 - \frac{e^2}{4\pi\epsilon_0}\,\frac{w(R) \pm v(R)}{1 \pm u(R)}$$

This means that the nuclei experience an effective potential energy given by

$$V^\pm_{\rm eff}(R) = \frac{e^2}{4\pi\epsilon_0 R} + \epsilon_\pm(R) = E_0 + \frac{e^2}{4\pi\epsilon_0}\left(\frac{1}{R} - \frac{w(R) \pm v(R)}{1 \pm u(R)}\right)$$

This makes sense: as $R \to \infty$, we get $V^\pm_{\rm eff} \to E_0$, which is the energy of a hydrogen atom.
Above, we have sketched the effective potential $V^\pm_{\rm eff} - E_0$ for the two wavefunctions $\psi_\pm$.


We see that the state $\psi_+$ gives rise to a minimum below zero. This indicates
the existence of a molecular bound state. In contrast, there is no such bound state for
$\psi_-$. This difference is primarily due to the fact that $\psi_+$ varies more slowly and so costs
less kinetic energy.
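
The two curves are simple enough to tabulate. The following sketch evaluates $V^\pm_{\rm eff} - E_0$ on a grid, using the expressions for $u$, $v$ and $w$ quoted above. It assumes units in which $a_0 = 1$ and $e^2/4\pi\epsilon_0 = 1$, and the grid range and the function name `v_eff_minus_e0` are arbitrary choices made for illustration.

```python
import math

def u(R):
    return (1 + R + R * R / 3) * math.exp(-R)

def v(R):
    return (1 + R) * math.exp(-R)

def w(R):
    return 1 / R - (1 / R) * (1 + R) * math.exp(-2 * R)

def v_eff_minus_e0(R, sign):
    """V_eff - E_0 for psi_+ (sign = +1) or psi_- (sign = -1), in units of e^2 / (4 pi eps0 a0)."""
    return 1 / R - (w(R) + sign * v(R)) / (1 + sign * u(R))

grid = [0.5 + 0.01 * k for k in range(951)]          # R from 0.5 to 10 Bohr radii
bonding = [(v_eff_minus_e0(R, +1), R) for R in grid]
antibonding = [(v_eff_minus_e0(R, -1), R) for R in grid]

v_min, r_min = min(bonding)          # depth and position of the minimum for psi_+
v_anti_min, _ = min(antibonding)     # smallest value reached by the psi_- curve on this grid
print(r_min, v_min, v_anti_min)
```

The $\psi_+$ curve should dip below zero at a separation of roughly two and a half Bohr radii, while the $\psi_-$ curve stays positive across the grid. Bear in mind that this is a variational estimate: the true ground state energy can only be lower, so the existence of the bound state is secure even though the numbers are approximate.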




3. Atoms 

The periodic table is one of the most iconic images in science. All elements are classified
in groups, ranging from metals on the left that go bang when you drop them in water
through to gases on the right that don’t do very much at all.


However, the periodic table contains plenty of hints that it is not the last word in 
science. There are patterns and order that run through it, all hinting at some deeper 
underlying structure. That structure, we now know, is quantum mechanics. 


The most important pattern is also the most obvious: the elements are ordered, 
labelled by an integer, Z. This is the atomic number which counts the number of 
protons in the nucleus. The atomic number is the first time that the integers genuinely 
play a role in physics. They arise, like most other integers in physics, as the spectrum 
of a particular Schrödinger equation. This equation is rather complicated and we 
won’t describe it in this course but, for what it’s worth, it involves a Hamiltonian 
which describes the interactions of quarks and is known as the theory of quantum 
chromodynamics. 


While the atomic number is related to the quantum mechanics of quarks, all the 
other features of the periodic table arise from the quantum mechanics of the electrons. 
The purpose of this section is to explain some of the crudest features of the table
from first principles. We will answer questions like: what determines the number of
elements in each row? Why are there gaps at the top, and two rows at the bottom
that we can’t fit in elsewhere? What’s special about the sequence of atomic numbers
2, 10, 18, 36, 54, 86, ... that label the inert gases?


We will also look at more quantitative properties of atoms, in particular their energy 
levels, and the ionization energy needed to remove a single electron. In principle, all of 
chemistry follows from solving the Schrödinger equation for some number of electrons.
However, solving the Schrödinger equation for many particles is hard and there is a
long path between “in principle” and “in practice”. In this section, we take the first 
steps down this path. 


3.1 Hydrogen 


We're going to start by looking at a very simple system that consists of a nucleus with 
just a single electron. This, of course, is hydrogen. 


Now I know what you’re thinking: you already solved the hydrogen atom in your 
first course on quantum mechanics. But you didn’t quite do it properly. There are a 
number of subtleties that were missed in that first attempt. Here we’re going to explore 
these subtleties. 


3.1.1 A Review of the Hydrogen Atom 


We usually treat the hydrogen atom by considering an electron of charge $-e$ orbiting
a proton of charge $+e$. With a view to subsequent applications, we will generalise this
slightly: we consider a nucleus of charge $Ze$, still orbited by a single electron of charge
$-e$. This means that we are also describing ions such as He$^+$ (for $Z = 2$) or Li$^{2+}$ (for
$Z = 3$). The Hamiltonian is

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0 r} \qquad (3.1)$$

The mass $m$ is usually taken to be the electron mass $m_e$ but since this is a two-body
problem it’s more correct to think of it as the reduced mass. (See, for example, Section
5.1.5 of the lectures on Dynamics and Relativity.) This means that $m = m_e M/(m_e + M) \approx m_e - m_e^2/M$
where $M$ is the mass of the nucleus. The resulting $m$ is very close
to the electron mass. For example, for hydrogen where the nucleus is a single proton,
$M = m_p \approx 1836\,m_e$.
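
To see just how close, here is a small numerical check. It is a sketch only: the masses are rounded values typed in by hand rather than numbers quoted in these notes, and the function name `reduced_mass` is introduced here for illustration.

```python
M_ELECTRON_KG = 9.1093837e-31    # electron mass (rounded)
M_PROTON_KG = 1.67262192e-27     # proton mass (rounded)

def reduced_mass(m1, m2):
    """Reduced mass of a two-body system."""
    return m1 * m2 / (m1 + m2)

# Exact ratio m / m_e for hydrogen, and the leading-order approximation 1 - m_e / M
ratio_exact = reduced_mass(M_ELECTRON_KG, M_PROTON_KG) / M_ELECTRON_KG
ratio_approx = 1 - M_ELECTRON_KG / M_PROTON_KG
print(ratio_exact, ratio_approx)
```

The reduced mass differs from $m_e$ by about one part in 1836, which is roughly 0.05%. Since the energy levels are proportional to $m$, this shifts every level by the same small fraction.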

The Schrödinger equation is the eigenvalue problem

$$H\psi = E_n\psi$$

This is the problem that you solved in your first course. The solutions are

$$\psi_{n,l,m}(r,\theta,\phi) = R_{n,l}(r)\,Y_{l,m}(\theta,\phi) \qquad (3.2)$$

where the radial functions $R_{n,l}(r)$ are built from the (generalised) Laguerre polynomials and $Y_{l,m}(\theta,\phi)$ are spherical
harmonics. The states are labelled by three quantum numbers,
$n$, $l$ and $m$, which take integer values in the range

$$n = 1,2,3,\ldots \ , \quad l = 0,1,\ldots,n-1 \ , \quad m = -l,\ldots,+l$$

(Don’t confuse the quantum number $m$ with the mass $m$! Both will appear in formulae
below, but it should be obvious which is which.) Importantly, the energy eigenvalue
only depends on the first of these quantum numbers $n$,

$$E_n = -\left(\frac{Ze^2}{4\pi\epsilon_0}\right)^2\frac{m}{2\hbar^2}\frac{1}{n^2} \qquad n \in \mathbf{Z}$$

where, just in case you weren’t sure, it’s the mass $m$ that appears in this formula. This
is sometimes written as

$$E_n = -\frac{Z^2\,Ry}{n^2}$$

where $Ry \approx 13.6$ eV is known as the Rydberg energy; it is the binding energy of the
ground state of hydrogen. Alternatively, it is useful to write the energy levels as

$$E_n = -\frac{(Z\alpha)^2 mc^2}{2n^2} \quad \text{where} \quad \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \qquad (3.3)$$

This may appear slightly odd as we’ve introduced factors of the speed of light $c$ which
subsequently cancel those in $\alpha$. Writing it this way means that we can immediately
see how the binding energies compare to the rest mass energy $mc^2$ of the electron. The
quantity $\alpha$ is dimensionless and takes the value $\alpha \approx 1/137$. It is called the fine structure
constant, a name that arises because it was first introduced in the calculations of the
“fine structure” of hydrogen that we will see below. The fine structure constant should
be thought of as the way to characterise the strength of the electromagnetic force.


Some Definitions 


This energy spectrum can be seen experimentally as spectral lines. These are due to
excited electrons dropping from one state $n$ to a lower state $n' < n$, emitting a photon
of fixed frequency $\hbar\omega = E_n - E_{n'}$. When the electron drops down to the ground state
with $n' = 1$, the resulting lines are called the Lyman series. When the electron drops
to higher states $n' > 1$, the sequences are referred to as Balmer ($n' = 2$), Paschen ($n' = 3$) and so on.
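
It’s worth putting in some numbers to see where these lines sit. The sketch below is illustrative: it assumes the rounded values $Ry \approx 13.6057$ eV and $hc \approx 1239.84$ eV nm, which are not quoted in these notes, ignores the small reduced-mass correction, and uses the made-up function names `energy_level_ev` and `wavelength_nm`.

```python
RYDBERG_EV = 13.605693      # Rydberg energy in eV (rounded, infinite nuclear mass)
HC_EV_NM = 1239.841984      # h * c in eV nm, so that wavelength = hc / (photon energy)

def energy_level_ev(n, Z=1):
    """E_n = -Z^2 Ry / n^2 in eV."""
    return -Z**2 * RYDBERG_EV / n**2

def wavelength_nm(n_upper, n_lower, Z=1):
    """Wavelength of the photon emitted in the transition n_upper -> n_lower."""
    photon_energy = energy_level_ev(n_upper, Z) - energy_level_ev(n_lower, Z)
    return HC_EV_NM / photon_energy

lyman_alpha = wavelength_nm(2, 1)    # first line of the Lyman series
balmer_alpha = wavelength_nm(3, 2)   # first line of the Balmer series
print(lyman_alpha, balmer_alpha)
```

The first Lyman line comes out at around 121.5 nm, in the ultraviolet, while the first Balmer line sits near 656 nm, which is visible red light. This is why the Balmer series was the first to be observed.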

Instead of using the angular momentum quantum number $l$ to label the state, they
are sometimes referred to by letters: $l = 0,1,2,3$ are called s, p, d and f respectively.
The names are old fashioned and come from the observed quality of spectral lines; they
stand for sharp, principal, diffuse and fundamental, but they remain standard when
describing atomic structure.


Degeneracy 


The fact that the energy depends only on $n$ and not on the angular momentum quantum
numbers $l$ and $m$ means that each energy eigenvalue is degenerate. For fixed $l$, there
are $2l + 1$ states labelled by $m$. This means that for a fixed $n$, the total number of
states is

$$\text{Degeneracy} = \sum_{l=0}^{n-1}(2l + 1) = n^2$$

Moreover, each electron also carries a spin degree of freedom. Measured along a given
axis, this spin can either be up (which means $m_s = \tfrac{1}{2}$) or down ($m_s = -\tfrac{1}{2}$). Including
this spin, the total degeneracy of states with energy $E_n$ is

$$\text{Degeneracy} = 2n^2$$
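
The counting can be confirmed by brute force. This is a minimal sketch, with the function name `hydrogen_states` invented for illustration: it simply lists every allowed combination of $(l, m, m_s)$ for a given $n$.

```python
def hydrogen_states(n, include_spin=True):
    """All labels (l, m, m_s) allowed for principal quantum number n."""
    spins = (0.5, -0.5) if include_spin else (None,)
    return [(l, m, ms)
            for l in range(n)              # l = 0, 1, ..., n - 1
            for m in range(-l, l + 1)      # m = -l, ..., +l
            for ms in spins]

orbital_counts = [len(hydrogen_states(n, include_spin=False)) for n in range(1, 6)]
total_counts = [len(hydrogen_states(n)) for n in range(1, 6)]
print(orbital_counts)   # n^2 for n = 1..5
print(total_counts)     # 2 n^2 for n = 1..5
```

The two lists reproduce $n^2$ and $2n^2$ for the first five levels.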


The main reason for revisiting the quantum mechanics of hydrogen is to understand
what becomes of this degeneracy. Before we proceed, it’s worth first thinking about
where this degeneracy comes from. Usually in quantum mechanics, any degeneracy is
related to a conservation law which, in turn, is related to a symmetry. The hydrogen
atom is no exception.


The most subtle degeneracy to explain is the fact that the energy does not depend on
$l$. This follows from the fact that the Hamiltonian (3.1) has a rather special conserved
quantity known as the Runge-Lenz vector. (We’ve met this in earlier courses in
classical and quantum mechanics.) This follows, ultimately, from a hidden $SO(4)$
symmetry in the formulation of the hydrogen atom. We therefore expect that any
deviation from (3.1) will lift the degeneracy in $l$.


Meanwhile, the degeneracy in m follows simply from rotational invariance and the 
corresponding conservation of angular momentum L. We don’t, therefore, expect this 
to be lifted unless something breaks the underlying rotational symmetry of the problem. 




Finally, the overall factor of 2 comes, of course, from the spin S. The degeneracy 
must, therefore, follow from the conservation of spin. Yet there is no such conservation 
law; spin is just another form of angular momentum. The only thing that is really 
conserved is the total angular momentum J = L + S. We would therefore expect any 
addition to the Hamiltonian (3.1) which recognises that only J is conserved to lift this 
spin degeneracy. 


We’ll now see in detail how this plays out. As we’ll show, there are a number of 
different effects which split these energy levels. These effects collectively go by the 
name of fine structure and hyperfine structure. 


3.1.2 Relativistic Motion 


The “fine structure” corrections to the hydrogen spectrum all arise from relativistic
corrections. There are three different relativistic effects that we need to take into
account: we will treat the first here, and the others in Sections 3.1.3 and 3.1.4.


You can run into difficulties if you naively try to incorporate special relativity into 
quantum mechanics. To do things properly, you need to work in the framework of 
Quantum Field Theory and the Dirac equation, both of which are beyond the scope 
of this course. However, we’re only going to be interested in situations where the 
relativistic effects can be thought of as small corrections to our original result. In 
this situation, it’s usually safe to stick with single-particle quantum mechanics and use 
perturbation theory. That’s the approach that we’ll take here. Nonetheless, a number 
of the results that we’ll derive below can only be rigorously justified by working with 
the Dirac equation. 


The first, and most straightforward, relativistic shift of the energy levels comes simply
from the fact that the effective velocity of electrons in an atom is a non-negligible fraction
of the speed of light, of order $Z\alpha$. Recall that the energy of a relativistic particle is

$$E = \sqrt{p^2c^2 + m^2c^4} \approx mc^2 + \frac{p^2}{2m} - \frac{p^4}{8m^3c^2} + \ldots$$

The constant term $mc^2$ can be neglected and the next term is the usual non-relativistic
kinetic energy which feeds into the Hamiltonian (3.1). Here we’ll treat the third term
as a perturbation of our hydrogen Hamiltonian

$$\Delta H = -\frac{p^4}{8m^3c^2} \qquad (3.4)$$


At first glance, it looks as if we're going to be dealing with degenerate perturbation
theory. However, this particular perturbation is blind to both angular momentum
quantum numbers $l$ and $m$, as well as the spin $m_s$. This follows straightforwardly from
the fact that $[\Delta H, \mathbf{L}^2] = [\Delta H, L_z] = 0$. If we denote the states (3.2) as $|nlm\rangle$, then it's
simple to show that

$$\langle nlm|\Delta H|nl'm'\rangle = 0 \quad\text{unless}\quad l = l' \ \text{and}\ m = m'$$

This means that the energy shifts are

$$(\Delta E_1)_{n,l} = \langle\Delta H\rangle_{n,l}$$


where we've introduced the notation $\langle\Delta H\rangle_{n,l} = \langle nlm|\Delta H|nlm\rangle$ and we've used the
fact that the perturbation preserves $SO(3)$ rotational invariance to anticipate that the
change of energy won't depend on the quantum number $m$. We want to compute this
expectation value. In fact, it's simplest to massage it a little bit by writing

$$\Delta H = -\frac{1}{2mc^2}\left[H - V(r)\right]^2$$

where $V(r) = -Ze^2/4\pi\epsilon_0 r$ is the Coulomb potential energy. This gives us the expression

$$(\Delta E_1)_{n,l} = -\frac{1}{2mc^2}\left[E_n^2 - 2E_n\langle V(r)\rangle_{n,l} + \langle V(r)^2\rangle_{n,l}\right] \qquad (3.5)$$

and our new goal is to compute the expectation values $\langle 1/r\rangle_{n,l}$ and $\langle 1/r^2\rangle_{n,l}$ for the
hydrogen atom wavefunctions.


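The first of these follows from the virial theorem (see Section 2.1.3) which tells us that
the relative contribution from the kinetic energy and potential energy is $2\langle T\rangle = -\langle V\rangle$,
so that $\langle E\rangle = \langle T\rangle + \langle V\rangle = \frac{1}{2}\langle V\rangle$. Then,

$$\left\langle\frac{1}{r}\right\rangle_{n,l} = -\frac{1}{Z\alpha\hbar c}\langle V\rangle_{n,l} = -\frac{2}{Z\alpha\hbar c}E_n = \frac{Z}{a_0}\frac{1}{n^2} \qquad (3.6)$$

where $a_0 = \hbar/\alpha mc$ is the Bohr radius, the length scale characteristic of the hydrogen
atom.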


Next up is $\langle 1/r^2\rangle$. Here there's a cunning trick. For any quantum system, if we took
the Hamiltonian $H$ and perturbed it to $H + \lambda/r^2$, then the leading order correction to
the energy levels would be $\langle\lambda/r^2\rangle$. But, for the hydrogen atom, such a perturbation
can be absorbed into the angular momentum terms,

$$\frac{\hbar^2 l(l+1)}{2mr^2} + \frac{\lambda}{r^2} = \frac{\hbar^2 l'(l'+1)}{2mr^2}$$

But this is again of the form of the hydrogen atom Hamiltonian and we can solve
it exactly. The only difference is that $l'$ is no longer an integer but some function
$l'(\lambda)$. The exact energy levels of the Hamiltonian with $l'$ follow from our first course
on quantum mechanics: they are

$$E(l') = -\frac{1}{2}mc^2(Z\alpha)^2\frac{1}{(k+l'+1)^2}$$

Usually we would define the integer $n = k + l + 1$ to get the usual spectrum $E_n$ given
in (3.3). Here, instead, we Taylor expand $E(l')$ around $\lambda = 0$ to get

$$E(l') = E_n + (Z\alpha)^2mc^2\left[\frac{1}{(k+l'+1)^3}\frac{dl'}{d\lambda}\right]_{\lambda=0}\lambda + \ldots = E_n + \frac{Z^2}{a_0^2}\frac{2}{n^3(2l+1)}\,\lambda + \ldots$$

From this we can read off the expectation value that we wanted: it is the leading
correction to our exact result,

$$\left\langle\frac{1}{r^2}\right\rangle_{n,l} = \frac{Z^2}{a_0^2}\frac{2}{n^3(2l+1)} \qquad (3.7)$$
The two expectation values (3.6) and (3.7) are what we need to compute the shift of
the energy levels (3.5). We have

$$(\Delta E_1)_{n,l} = -\frac{(Z\alpha)^4mc^2}{2}\left(\frac{n}{l+\frac{1}{2}} - \frac{3}{4}\right)\frac{1}{n^4} \qquad (3.8)$$

As anticipated above, the relativistic effect removes the degeneracy in the quantum
number $l$.

Notice that the size of the correction is of order $(Z\alpha)^4$. This is smaller than the
original energy (3.3) by a factor of $(Z\alpha)^2$. Although we may not have realised it, $(Z\alpha)^2$
is the dimensionless ratio which we're relying on to be small so that perturbation theory
is valid. (Or, for higher states, $(Z\alpha/n)^2$.)
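As a quick consistency check, the following Python sketch evaluates (3.5) using the expectation values (3.6) and (3.7) and compares the answer with the closed form (3.8). It is only an illustration: energies are measured in units of $(Z\alpha)^4mc^2$, lengths in units of $a_0/Z$, and the function names are invented for this example.

```python
from fractions import Fraction

def inv_r(n):
    """<1/r> in units of Z/a0, from (3.6)."""
    return Fraction(1, n**2)

def inv_r2(n, l):
    """<1/r^2> in units of (Z/a0)^2, from (3.7)."""
    return Fraction(2, n**3 * (2 * l + 1))

def shift_from_expectations(n, l):
    """(3.5) in units of (Z alpha)^4 m c^2.

    In units of (Z alpha)^2 m c^2 we have E_n = -1/(2 n^2),
    <V> = -<1/r> and <V^2> = <1/r^2>.
    """
    e_n = Fraction(-1, 2 * n**2)
    v1 = -inv_r(n)
    v2 = inv_r2(n, l)
    return -Fraction(1, 2) * (e_n**2 - 2 * e_n * v1 + v2)

def shift_closed_form(n, l):
    """(3.8) in units of (Z alpha)^4 m c^2."""
    return -Fraction(1, 2) * (Fraction(2 * n, 2 * l + 1) - Fraction(3, 4)) / n**4

# The two expressions agree for every allowed (n, l) that we try.
for n in range(1, 6):
    for l in range(n):
        assert shift_from_expectations(n, l) == shift_closed_form(n, l)

print(shift_closed_form(1, 0))   # ground state: -5/8
```

Exact rational arithmetic is used so that the comparison is an identity rather than a floating-point coincidence.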


It's worth asking why we ended up with a perturbation to the energy which is smaller
by a factor of $(Z\alpha)^2$. Because this was a relativistic correction, we expect it to be of
order $v^2/c^2$ where $v$ is the characteristic velocity of the electron. We can understand
this by invoking the virial theorem which, in general, states that for a potential $V \sim r^n$
the expectation value of the kinetic energy $\langle T\rangle$ is related to that of the potential energy by
$2\langle T\rangle = n\langle V\rangle$. For the hydrogen atom, this means that $\langle T\rangle = \frac{1}{2}m\langle v^2\rangle = -\frac{1}{2}\langle V\rangle$. Since,
from the ground state energy (3.3), we know that $E_1 = \langle T\rangle + \langle V\rangle = -mc^2(Z\alpha)^2/2$, we
have $\langle v^2\rangle = (Z\alpha)^2c^2$, which confirms that $(Z\alpha)^2$ is indeed the small parameter in the
problem.




3.1.3 Spin-Orbit Coupling and Thomas Precession 


The second shift of the energy levels comes from an interaction between the electron 
spin S and its angular momentum L. This is known as the spin-orbit coupling. 


The first fact we will need is that spin endows the electron with a magnetic dipole
moment given by

$$\mathbf{m} = -g\frac{e}{2m}\mathbf{S} \qquad (3.9)$$

The coefficient of proportionality is called the gyromagnetic ratio or, sometimes, just
the g-factor. To leading order $g = 2$ for the electron. This fact follows from the Dirac
equation for the electron. We won't derive this here and, for now, you will have to take
this fact at face value. A more precise analysis using quantum field theory shows that
g receives small corrections. The current best known value is g = 2.00231930436182..., 
but we'll stick with g = 2 in our analysis below. 


The second fact that we need is that the energy of a magnetic moment $\mathbf{m}$ in a
magnetic field $\mathbf{B}$ is given by

$$E = -\mathbf{B}\cdot\mathbf{m}$$

This is something we derived in Section 3 of the lectures on Electromagnetism.


The final fact is the Lorentz transformation of the electric field: an electron moving
with velocity $\mathbf{v}$ in an electric field $\mathbf{E}$ will experience a magnetic field

$$\mathbf{B} = -\frac{\gamma}{c^2}\,\mathbf{v}\times\mathbf{E}$$

This was derived in Section 5 of the lectures on Electromagnetism.


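We now apply this to the electron in orbit around the nucleus. The electron experiences
a radial electric field given by $\mathbf{E} = -\nabla\phi(r)$ with $\phi(r) = Ze/4\pi\epsilon_0 r$. Putting
everything together, the resulting magnetic field interacts with the spin, giving rise to
a correction to the energy of the electron

$$\Delta E = \frac{e}{(mc)^2}\,(\mathbf{p}\times\nabla\phi)\cdot\mathbf{S} = -\frac{e}{(mc)^2}\frac{1}{r}\frac{\partial\phi}{\partial r}\,\mathbf{L}\cdot\mathbf{S}$$

where $\mathbf{p} = m\gamma\mathbf{v}$ is the momentum and $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ is the angular momentum. This is the
promised spin-orbit coupling, in a form which we can promote to an operator. Thus
the spin-orbit correction to the Hamiltonian is

$$\Delta H_{SO} = -\frac{e}{(mc)^2}\frac{1}{r}\frac{\partial\phi}{\partial r}\,\mathbf{L}\cdot\mathbf{S} \qquad (3.10)$$

Except....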


Thomas Precession 


It turns out that the interaction (3.10) is actually incorrect by a factor of 1/2. This is
because of a subtle, relativistic effect known as Thomas precession.


Thomas precession arises because the electron orbiting the nucleus is in a non-inertial 
frame. As we will now explain, this means that even if the electron experienced no 
magnetic field, its spin would still precess around the orbit. 


The basic physics follows from the structure of the Lorentz group. (See Section 7
of the lectures on Dynamics and Relativity.) Consider a Lorentz boost $\Lambda(v)$ in the
x-direction, followed by a Lorentz boost $\Lambda'(v')$ in the y-direction. Some simple matrix
multiplication will convince you that the resulting Lorentz transformation cannot be
written solely as a boost. Instead, it is a boost together with a rotation,

$$\Lambda'(v')\Lambda(v) = R(\theta)\Lambda''(v'')$$

where $\Lambda''(v'')$ is an appropriate boost while $R(\theta)$ is a rotation in the x-y plane.
This rotation is known as the Wigner rotation (or sometimes the Thomas rotation).
Although we will not need this fact below, you can check that $\cos\theta = (\gamma+\gamma')/(\gamma\gamma'+1)$
with $\gamma$ and $\gamma'$ the usual relativistic factors.
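The "simple matrix multiplication" can be handed to a computer. The sketch below, which works in 2+1 dimensions with coordinates $(ct, x, y)$ and uses helper names invented for this example, composes the two boosts, confirms that the product is not symmetric (so it cannot be a pure boost) and extracts the rotation angle. It relies on the observation that, if the product is $R\Lambda''$, its first row holds the vector $-\gamma''\boldsymbol{\beta}''$ while its first column holds the same vector after rotation by $R$.

```python
import math

def gamma(beta):
    return 1.0 / math.sqrt(1.0 - beta**2)

def boost_x(beta):
    g = gamma(beta)
    return [[g, -g * beta, 0.0], [-g * beta, g, 0.0], [0.0, 0.0, 1.0]]

def boost_y(beta):
    g = gamma(beta)
    return [[g, 0.0, -g * beta], [0.0, 1.0, 0.0], [-g * beta, 0.0, g]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def wigner_cos(beta, beta_prime):
    """cos(theta) for a boost along x followed by a boost along y."""
    m = matmul(boost_y(beta_prime), boost_x(beta))
    u = (m[0][1], m[0][2])   # first row: unrotated boost vector
    w = (m[1][0], m[2][0])   # first column: the same vector after the rotation
    dot = u[0] * w[0] + u[1] * w[1]
    return dot / (math.hypot(*u) * math.hypot(*w))

beta, beta_prime = 0.6, 0.8
product = matmul(boost_y(beta_prime), boost_x(beta))
g, gp = gamma(beta), gamma(beta_prime)

print(product[0][1], product[1][0])                     # differ: not a pure boost
print(wigner_cos(beta, beta_prime), (g + gp) / (g * gp + 1.0))
```

The two numbers on the last line agree, which is the formula for $\cos\theta$ quoted above.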


Now we're going to apply this to a classical electron in orbit around the nucleus. At
a fixed moment in time, it is moving with some velocity $\mathbf{v}$ relative to the nucleus. A
short time later, it is moving with velocity $\mathbf{v} + \delta\mathbf{v}$. The net effect of these two boosts
is, as above, a boost together with a rotation.


If the electron were a point particle, the Wigner rotation would have no effect.
However, the electron is not a point particle: it carries a spin degree of freedom $\mathbf{S}$ and this
is rotated by the Wigner/Thomas effect. The cumulative effect of these rotations is that
the spin precesses as the electron orbits the nucleus. We would like to calculate how
much.

Figure 15:


The correct way to compute the precession is to integrate up the consecutive,
infinitesimal Lorentz transformations as the electron orbits the nucleus. Here, instead,
we present a quick and dirty derivation. We approximate the circular orbit of the
electron by an N-sided polygon. Clearly in the lab frame, at the end of each segment
the electron shifts its velocity by an angle $\theta = 2\pi/N$. However, in the electron's frame
there is a Lorentz contraction along the direction parallel to the electron's motion. This
means that the electron thinks it rotates by the larger angle $\tan\theta' = x/(y/\gamma)$ which,
for $N$ large, is $\theta' \approx 2\pi\gamma/N$. The upshot is that, by the time the electron has completed
a full orbit, it thinks that it has rotated by an excess angle of

$$\Delta\theta = 2\pi(\gamma - 1) \approx \frac{2\pi v^2}{2c^2}$$

where we have expanded the relativistic factor $\gamma = (1 - v^2/c^2)^{-1/2} \approx 1 + v^2/2c^2$.


This is all we need to determine the precession rate, $\omega_T$. If the particle traverses the
orbit with speed $v$ and period $T$, then

$$\omega_T = \frac{\Delta\theta}{T} \approx \frac{2\pi v^2}{2c^2T} = \frac{av}{2c^2}$$

where, in the final step, we've replaced the period $T$ with the acceleration $a = v^2/R = 2\pi v/T$.


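Our derivation above tells us the angular precession. But what does this mean for a
vector like $\mathbf{S}$? A little thought shows that the component of $\mathbf{S}$ that lies perpendicular
to the plane of the orbit remains unchanged, while the component that lies within the
plane precesses with frequency $\omega_T$. In other words,

$$\frac{d\mathbf{S}}{dt} = \boldsymbol{\omega}_T\times\mathbf{S} \quad\text{with}\quad \boldsymbol{\omega}_T = \frac{\mathbf{a}\times\mathbf{v}}{2c^2} \qquad (3.11)$$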
This is Thomas precession. The effect is purely kinematic, due to the fact that the 
electron is not in an inertial frame. It can be thought of as a relativistic analog of the 
Coriolis force. 


Finally, note that in several places above, we needed the assumption that $v/c$ is
small. Correspondingly, our final result (3.11) is only the leading order answer. The
correct answer turns out to be

$$\boldsymbol{\omega}_T = \frac{\gamma^2}{\gamma+1}\,\frac{\mathbf{a}\times\mathbf{v}}{c^2}$$

However, (3.11) will suffice for our purposes.


Thomas Precession and the Spin-Orbit Coupling 


Let's now see how the existence of Thomas precession affects the spin-orbit coupling.
Again, we'll start with some basics. Classically, the energy $E = (e/m)\,\mathbf{B}\cdot\mathbf{S}$ means
that a spin will experience a torque when placed in a magnetic field. This, in turn, will
cause it to precess

$$\frac{d\mathbf{S}}{dt} = \frac{e}{m}\,\mathbf{B}\times\mathbf{S}$$

However, we've seen that Thomas precession (3.11) gives a further contribution to this.
So the correct equation should be

$$\frac{d\mathbf{S}}{dt} = \frac{e}{m}\,\mathbf{B}\times\mathbf{S} + \boldsymbol{\omega}_T\times\mathbf{S}$$

The energy functional which gives rise to this is

$$E = \frac{e}{m}\,\mathbf{B}\cdot\mathbf{S} + \boldsymbol{\omega}_T\cdot\mathbf{S}$$

Working to leading order in $v/c$, we massage the second term as

$$\boldsymbol{\omega}_T\cdot\mathbf{S} = \frac{1}{2c^2}(\mathbf{a}\times\mathbf{v})\cdot\mathbf{S} = \frac{e}{2(mc)^2}(\nabla\phi\times\mathbf{p})\cdot\mathbf{S} = \frac{e}{2(mc)^2}\frac{1}{r}\frac{\partial\phi}{\partial r}\,\mathbf{L}\cdot\mathbf{S}$$

where we've used Newton's second law to write $m\mathbf{a} = e\nabla\phi$. We see that this comes with the
opposite sign and half the magnitude of the original contribution (3.10) to the energy.
Adding the two together gives the final result for the correction to the Hamiltonian due
to the spin-orbit coupling

$$\Delta H_{SO} = -\frac{e}{2(mc)^2}\frac{1}{r}\frac{\partial\phi}{\partial r}\,\mathbf{L}\cdot\mathbf{S} \qquad (3.12)$$

with $\phi(r)$ the electrostatic potential which, for us, is $\phi = Ze/4\pi\epsilon_0 r$.


Computing the Spin-Orbit Energy Shift 


Before our perturbation, the electron states were labelled by $|nlm\rangle$, together with
the spin $m_s = \pm 1/2$. The spin-orbit coupling will split the spin and angular momentum $l$
degeneracy of the spectrum. To anticipate this, we should label these states by the
total angular momentum

$$\mathbf{J} = \mathbf{L} + \mathbf{S}$$

which takes quantum numbers $j = l \pm 1/2$ with $l = 0, 1, \ldots$. (When $l = 0$, we only
have $j = 1/2$.) Each state can therefore be labelled by $|n, j, m_j; l\rangle$ where $|m_j| \le j$ and
the additional label $l$ is there to remind us where these states came from.

We want to compute the eigenvalue of $\mathbf{L}\cdot\mathbf{S}$ acting on these states. The simplest way
to do this is to consider $\mathbf{J}^2 = \mathbf{L}^2 + \mathbf{S}^2 + 2\mathbf{L}\cdot\mathbf{S}$, which tells us that

$$\mathbf{L}\cdot\mathbf{S}\,|n,j,m_j;l\rangle = \frac{\hbar^2}{2}\left(j(j+1) - l(l+1) - \frac{3}{4}\right)|n,j,m_j;l\rangle = \frac{\hbar^2}{2}\left\{\begin{array}{ll} -(l+1) & \ j = l - \frac{1}{2}\ (l\neq 0) \\ l & \ j = l + \frac{1}{2}\end{array}\right\}|n,j,m_j;l\rangle \qquad (3.13)$$

As in Section 3.1.2, when computing degenerate perturbation theory with $|n, j, m_j; l\rangle$,
the off-diagonal matrix elements vanish. We are left with the shift of the energy
eigenvalues given by

$$(\Delta E_2)_{n,j;l} = \langle\Delta H_{SO}\rangle_{n,j;l}$$

where $\langle\Delta H_{SO}\rangle_{n,j;l} = \langle n, j, m_j; l|\Delta H_{SO}|n, j, m_j; l\rangle$.
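The eigenvalues of $\mathbf{L}\cdot\mathbf{S}$ on states of definite $j$ are easy to tabulate. The following sketch (the function name is made up for this example, and eigenvalues are quoted in units of $\hbar^2$) evaluates $\frac{1}{2}[j(j+1) - l(l+1) - s(s+1)]$ for spin $s = 1/2$ and checks two things: the values for $j = l \pm 1/2$, and the fact that the shifts, weighted by the $2j+1$ states in each multiplet, sum to zero.

```python
from fractions import Fraction

def ls_eigenvalue(l, j):
    """Eigenvalue of L.S in units of hbar^2, for spin s = 1/2."""
    s = Fraction(1, 2)
    return (j * (j + 1) - l * (l + 1) - s * (s + 1)) / 2

half = Fraction(1, 2)
for l in range(1, 5):
    upper = ls_eigenvalue(l, l + half)   # j = l + 1/2
    lower = ls_eigenvalue(l, l - half)   # j = l - 1/2
    assert upper == Fraction(l, 2)
    assert lower == -Fraction(l + 1, 2)
    # Weighted by the (2j + 1) states in each multiplet, the shifts sum to zero.
    assert (2 * l + 2) * upper + (2 * l) * lower == 0
```

The vanishing of the weighted sum reflects the fact that $\mathbf{L}\cdot\mathbf{S}$ is traceless on the states with fixed $n$ and $l$: the spin-orbit coupling splits the level without moving its centre of gravity.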


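With $\Delta H_{SO}$ given in (3.12), and $\phi(r) = Ze/4\pi\epsilon_0 r$, the shift of energy levels is

$$(\Delta E_2)_{n,j;l} = \frac{Ze^2\hbar^2}{4\cdot 4\pi\epsilon_0(mc)^2}\left\{\begin{array}{c} -(l+1) \\ l \end{array}\right\}\left\langle\frac{1}{r^3}\right\rangle_{n,l}$$

where, as in (3.13), the upper entry in $\{\cdot\}$ corresponds to $j = l - \frac{1}{2}$ (with $l \neq 0$) and
the lower entry corresponds to $j = l + \frac{1}{2}$. Note that when $l = 0$, we have $\Delta E_2 = 0$
because there is no angular momentum for the spin to couple to.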


In the previous section, we needed to compute $\langle 1/r\rangle$ and $\langle 1/r^2\rangle$. We see that now
we need to compute $\langle 1/r^3\rangle$. Once again, there is a cute trick. This time, we introduce
a new "radial momentum" observable

$$\tilde{p} = -i\hbar\left(\frac{\partial}{\partial r} + \frac{1}{r}\right)$$

It's simple to check that the radial part of the Hamiltonian can be written as

$$H = \frac{\tilde{p}^2}{2m} + \frac{\hbar^2 l(l+1)}{2mr^2} - \frac{Ze^2}{4\pi\epsilon_0 r}$$

A quick computation shows that

$$[\tilde{p}, H] = -i\hbar\left(-\frac{\hbar^2 l(l+1)}{mr^3} + \frac{Ze^2}{4\pi\epsilon_0 r^2}\right)$$

Clearly this commutator doesn't vanish. However, when evaluated on an energy
eigenstate, we must have $\langle[\tilde{p}, H]\rangle_{n,l} = 0$. From our expression above, this tells us that

$$\left\langle\frac{1}{r^3}\right\rangle_{n,l} = \frac{Z}{a_0}\frac{1}{l(l+1)}\left\langle\frac{1}{r^2}\right\rangle_{n,l} = \left(\frac{Z}{a_0}\right)^3\frac{1}{l(l+\frac{1}{2})(l+1)}\frac{1}{n^3}$$

where we've used our earlier result (3.7) and, as before, $a_0 = \hbar/\alpha mc$ is the Bohr radius.
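The three expectation values can be checked directly in the simplest family of states, those with $l = n - 1$, whose radial wavefunction is proportional to $r^{n-1}e^{-Zr/na_0}$ and has no nodes. The sketch below is an illustration under that restriction only; the function name is invented, and lengths are measured in units of $a_0/Z$. Both the normalisation integral and the integral with the extra $r^{-k}$ are then elementary Gamma-function integrals.

```python
from fractions import Fraction
from math import factorial

def inv_power_circular(n, k):
    """<r^-k> for the nodeless state l = n - 1, in units of (Z/a0)^k.

    With R(r) ~ r^(n-1) exp(-r/n), both integrals are of the form
    int_0^inf r^p exp(-2r/n) dr = p! (n/2)^(p+1).
    """
    numerator = factorial(2 * n - k) * Fraction(n, 2) ** (2 * n - k + 1)
    denominator = factorial(2 * n) * Fraction(n, 2) ** (2 * n + 1)
    return numerator / denominator

for n in range(2, 7):
    l = n - 1
    assert inv_power_circular(n, 1) == Fraction(1, n**2)                  # (3.6)
    assert inv_power_circular(n, 2) == Fraction(2, n**3 * (2 * l + 1))    # (3.7)
    # <1/r^3> = 1 / (n^3 l (l + 1/2) (l + 1))
    assert inv_power_circular(n, 3) == Fraction(2, n**3 * l * (2 * l + 1) * (l + 1))
```

This does not replace the general arguments above, which hold for all $l$, but it is a useful sanity check on the factors of 2.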


Putting this together, and re-writing the resulting expression in terms of $j$ rather than
$l$, we find that the shift of energy levels due to spin-orbit coupling is

$$(\Delta E_2)_{n,j;l} = \frac{(Z\alpha)^4mc^2}{4}\left\{\begin{array}{c} -\frac{1}{j+1} \\ \frac{1}{j} \end{array}\right\}\frac{1}{j+\frac{1}{2}}\,\frac{1}{n^3}$$

This is the same order of magnitude as the first fine-structure shift (3.8) which,
re-written in terms of $j = l \pm \frac{1}{2}$, becomes

$$(\Delta E_1)_{n,l} = \frac{(Z\alpha)^4mc^2}{2}\left(\frac{3}{4} - n\left\{\begin{array}{c} \frac{1}{j+1} \\ \frac{1}{j} \end{array}\right\}\right)\frac{1}{n^4}$$

Combining these results, we get an expression which happily looks the same regardless
of the choice of sign in $j = l \pm \frac{1}{2}$. It is

$$(\Delta E_1)_{n,l} + (\Delta E_2)_{n,j;l} = \frac{(Z\alpha)^4mc^2}{2}\left(\frac{3}{4} - \frac{2n}{2j+1}\right)\frac{1}{n^4} \qquad (3.14)$$

where we should remember that for $l = 0$, $(\Delta E_2)_{n,j;l} = 0$ and we only get the $(\Delta E_1)_{n,l}$
term.


3.1.4 Zitterbewegung and the Darwin Term 


There is one final contribution to the fine structure of the hydrogen atom. This one 
is somewhat more subtle than the others and a correct derivation really requires us to 
use the Dirac equation. Here we give a rather hand-waving explanation. 


One of the main lessons from combining quantum mechanics with special relativity is
that particles are not point-like. A particle of mass $m$ has a size given by the Compton
wavelength,

$$\lambda = \frac{\hbar}{mc}$$

For the electron, $\lambda \approx 4\times 10^{-11}$ cm. Roughly speaking, if you look at a distance smaller
than this you will see a swarm of particles and anti-particles and the single particle that
you started with becomes blurred by this surrounding crowd.

Quantum field theory provides the framework to deal with this. However, within the
framework of quantum mechanics it is something that we have to put in by hand. In
this context, it is sometimes called Zitterbewegung, or "trembling motion". Suppose
that a particle moves in a potential $V(\mathbf{r})$. Then, if the particle sits at position $\mathbf{r}_0$, it
will experience the average of the potential in some region which is smeared a distance
$\sim\lambda$ around $\mathbf{r}_0$. To include this, we Taylor expand the potential

$$V(\mathbf{r}_0 + \Delta\mathbf{r}) = V(\mathbf{r}_0) + \Delta\mathbf{r}\cdot\frac{\partial V}{\partial\mathbf{r}} + \frac{1}{2}\Delta r_i\Delta r_j\frac{\partial^2 V}{\partial r_i\partial r_j} + \ldots$$

By rotational symmetry, $\langle\Delta\mathbf{r}\rangle = 0$. Meanwhile, we take

$$\langle\Delta r_i\Delta r_j\rangle = \left(\frac{\lambda}{2}\right)^2\delta_{ij}$$

I don't have an argument for the factor of 1/2 on the right-hand-side of this expectation
value. You will have to resort to the Dirac equation to see this. This gives a further
contribution to the Hamiltonian, known as the Darwin term

$$\Delta H_{\rm Darwin} = \frac{\hbar^2}{8m^2c^2}\nabla^2 V \qquad (3.15)$$

For the Coulomb potential, this becomes

$$\Delta H_{\rm Darwin} = \frac{Z\alpha\hbar^3}{8m^2c}\,4\pi\delta^3(\mathbf{r})$$


However, all wavefunctions with $l > 0$ vanish at the origin and so are unaffected by
the Darwin term. Only those with $l = 0$ have a correction to their energy, given by

$$(\Delta E_3)_{n,l} = \langle\Delta H_{\rm Darwin}\rangle_{n,l} = \frac{Z\alpha\hbar^3\pi}{2m^2c}\,|\psi_{n,l=0}(0)|^2$$

The normalised wavefunction takes the form $\psi_{nlm}(\mathbf{r}) = R_{nl}(r)Y_{lm}(\theta,\phi)$. For $l = 0$, we
have $Y_{00} = 1/\sqrt{4\pi}$ and the radial wavefunction takes the form

$$R_{n,l=0}(r) = \sqrt{\left(\frac{2Z}{na_0}\right)^3\frac{(n-1)!}{2n(n!)^3}}\ e^{-Zr/na_0}\,L^1_n(2Zr/na_0)$$

Now we need to dig out some properties of Laguerre polynomials. We will need the
facts that $L^1_n(x) = dL_n(x)/dx$ and $L_n(x) \approx n! - n!\,nx + O(x^2)$ so that $|L^1_n(0)| = n!\,n$.
The wavefunction at the origin then becomes

$$|\psi_{n,l=0}(0)|^2 = \frac{Z^3}{a_0^3\pi n^3} \qquad (3.16)$$

From this we get

$$(\Delta E_3)_{n,l} = \frac{(Z\alpha)^4mc^2}{2}\frac{1}{n^3}\,\delta_{l0} \qquad (3.17)$$
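The value of the s-wave wavefunction at the origin can be checked with exact arithmetic. The sketch below assumes the older convention for Laguerre polynomials in which $L_n(0) = n!$, together with the standard normalisation of the hydrogen radial wavefunction; the function names are invented for this example, and lengths are measured in units of $a_0/Z$.

```python
from fractions import Fraction
from math import comb, factorial

def laguerre_coeffs(n):
    """Coefficients c_k of L_n(x) = sum_k c_k x^k, in the convention L_n(0) = n!."""
    return [Fraction((-1) ** k * factorial(n) * comb(n, k), factorial(k))
            for k in range(n + 1)]

def assoc_laguerre_1_at_zero(n):
    """L^1_n(0) = dL_n/dx at x = 0, which is the coefficient of x in L_n."""
    return laguerre_coeffs(n)[1]

def radial_at_origin_sq(n):
    """R_{n0}(0)^2 in units of (Z/a0)^3."""
    norm_sq = Fraction(2, n) ** 3 * Fraction(factorial(n - 1), 2 * n * factorial(n) ** 3)
    return norm_sq * assoc_laguerre_1_at_zero(n) ** 2

def darwin_shift(n):
    """Darwin shift of an s-wave in units of (Z alpha)^4 m c^2.

    The prefactor pi/2 multiplies |psi(0)|^2 = R(0)^2 / (4 pi), giving R(0)^2 / 8.
    """
    return radial_at_origin_sq(n) / 8

for n in range(1, 7):
    assert radial_at_origin_sq(n) == Fraction(4, n**3)   # so |psi(0)|^2 = 1/(pi n^3)
    assert darwin_shift(n) == Fraction(1, 2 * n**3)      # agrees with (3.17)
```

For $n = 1$ this reproduces the familiar $R_{10}(0) = 2$ in units where $a_0/Z = 1$.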
A Combined Spin-Orbit-Darwin Term


Our derivation of the spin-orbit term (3.12), including Thomas precession, and the 
Darwin term (3.15) was somewhat involved and, at times, a little hand-wavy. In fact, 
there’s a simple way to combine these two expressions which, ultimately, fits nicely 
with the Dirac equation. We claim that the combined expression for the fine structure 
can be written as 


$$\Delta H = \Delta H_{SO} + \Delta H_{\rm Darwin} = -\frac{1}{8m^2c^2}\left[\boldsymbol{\sigma}\cdot\mathbf{p}, \left[\boldsymbol{\sigma}\cdot\mathbf{p}, V(\mathbf{r})\right]\right] \qquad (3.18)$$

Here $\boldsymbol{\sigma} = (\sigma^1, \sigma^2, \sigma^3)$ are the three Pauli matrices and are related to the spin operator
by $\mathbf{S} = \frac{1}{2}\hbar\boldsymbol{\sigma}$. Note that, other than the usual kinetic energy, the term (3.18) is the only
other term that we can write down that is quadratic in momentum and involves only
the spin matrices $\mathbf{S}$ and the potential. The factor of $1/m^2c^2$ is fixed on dimensional
grounds but the overall coefficient of 1/8 is not: you have to do one of the calculations
above to fix this.


Let's now show that (3.18) does indeed reproduce the combined spin-orbit and Darwin
couplings as claimed. Expanding, we have

$$\left[\boldsymbol{\sigma}\cdot\mathbf{p}, \left[\boldsymbol{\sigma}\cdot\mathbf{p}, V\right]\right] = \mathbf{p}^2V + V\mathbf{p}^2 - 2\,\boldsymbol{\sigma}\cdot\mathbf{p}\,V\,\boldsymbol{\sigma}\cdot\mathbf{p} = -\hbar^2\nabla^2V - 4(\nabla V\times\mathbf{p})\cdot\mathbf{S}$$

where, in going to the final expression, we've used $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$, together with the
usual operator expressions $\mathbf{p} = -i\hbar\nabla$ and $\mathbf{S} = \frac{1}{2}\hbar\boldsymbol{\sigma}$. We recognise the first term as
the Darwin contribution (3.15) (up to an overall constant). For the second term, we
need the fact that $V(r)$ is a central potential, with $\nabla V = (dV/dr)\,\hat{\mathbf{r}}$. A little algebra
shows that this then coincides with the spin-orbit term (3.12), with $\mathbf{L} = \mathbf{r}\times\mathbf{p}$ and the
potential energy $V$ related to the electrostatic potential as $V = -e\phi$. Again, we stress
that we need one of our previous arguments to fix the overall coefficient of $-1/8$ in
(3.18), but this form fixes the relative coefficient between spin-orbit and Darwin.
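The Pauli matrix identity used in this expansion is easy to verify by brute force. The sketch below writes the matrices as nested lists of complex numbers and checks $\sigma^i\sigma^j = \delta^{ij} + i\epsilon^{ijk}\sigma^k$ for all nine pairs of indices; the helper names are made up for this example.

```python
# Pauli matrices as nested lists of complex numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def pauli_identity_holds():
    """Check sigma^i sigma^j = delta^ij + i eps^ijk sigma^k entry by entry."""
    for i in range(3):
        for j in range(3):
            lhs = matmul(SIGMA[i], SIGMA[j])
            for r in range(2):
                for c in range(2):
                    rhs = (i == j) * IDENTITY[r][c] + 1j * sum(
                        levi_civita(i, j, k) * SIGMA[k][r][c] for k in range(3))
                    if abs(lhs[r][c] - rhs) > 1e-12:
                        return False
    return True

print(pauli_identity_holds())
```

Contracting the identity with two vectors $\mathbf{a}$ and $\mathbf{b}$ gives $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$, which is the form that produces the cross product $\nabla V\times\mathbf{p}$ above.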


3.1.5 Finally, Fine-Structure 


It's been quite a long journey. Our fine structure calculations have revealed three
contributions, the first two given by (3.14) and the third by (3.17). Recall that the
spin-orbit coupling in (3.14) gave vanishing contribution when $l = 0$. Rather curiously,
the Darwin term gives a contribution only when $l = 0$ which coincides with the formal
answer for the spin-orbit coupling when $l = 0$, $j = 1/2$. The upshot of this is that the
answer (3.14) we found before actually holds for all $l$. In other words, adding all the
contributions together, $(\Delta E)_{n,j} = (\Delta E_1)_{n,l} + (\Delta E_2)_{n,j;l} + (\Delta E_3)_{n,l}$, we have our final
result for the fine structure of the hydrogen atom

$$(\Delta E)_{n,j} = \frac{(Z\alpha)^4mc^2}{2}\left(\frac{3}{4} - \frac{2n}{2j+1}\right)\frac{1}{n^4}$$

We learn that the energy splitting depends only on $j$. This didn't have to be the case.
There is no symmetry that requires the two states with $l = j \pm \frac{1}{2}$ to have the same energy.
We refer to this as an accidental degeneracy. Meanwhile, the energy of each state is
independent of the remaining angular momentum quantum number $m_j$, with $|m_j| \le j$. This is not
accidental: it is guaranteed by rotational invariance.


To describe the states of hydrogen, we use the notation $n\#_j$ where we replace #
with the letter that denotes the orbital angular momentum $l$. The ground state is then
$1s_{1/2}$. This is doubly degenerate as there is no orbital angular momentum, so the spin states
are not split by spin-orbit coupling. The first excited states are $2s_{1/2}$ (two spin states)
which is degenerate with $2p_{1/2}$ (two states, with $m_j = \pm 1/2$). Similarly, as we go
up the spectrum we find that the $3p_{3/2}$ and $3d_{3/2}$ states are degenerate and so on.
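To see how the three contributions conspire, the sketch below adds the relativistic kinetic shift, the spin-orbit shift and the Darwin shift for each state and compares the sum with the final formula that depends only on $n$ and $j$. It is an illustration in units of $(Z\alpha)^4mc^2$, with function names invented here; the numerical values of $\alpha$ and $mc^2$ used at the end are the standard ones and are not taken from these notes.

```python
from fractions import Fraction

HALF = Fraction(1, 2)

def kinetic(n, l):
    """Relativistic kinetic shift (3.8)."""
    return -HALF * (Fraction(2 * n, 2 * l + 1) - Fraction(3, 4)) / n**4

def spin_orbit(n, l, j):
    """Spin-orbit shift; vanishes for l = 0."""
    if l == 0:
        return Fraction(0)
    ls = (j * (j + 1) - l * (l + 1) - Fraction(3, 4)) / 2     # L.S / hbar^2
    inv_r3 = Fraction(2, n**3 * l * (2 * l + 1) * (l + 1))    # <1/r^3> in (Z/a0)^3
    return HALF * ls * inv_r3

def darwin(n, l):
    """Darwin shift (3.17): only s-waves are affected."""
    return Fraction(1, 2 * n**3) if l == 0 else Fraction(0)

def fine_structure(n, j):
    """Final result, which depends only on n and j."""
    return HALF * (Fraction(3, 4) - 2 * n / (2 * j + 1)) / n**4

for n in range(1, 6):
    for l in range(n):
        for j in (l - HALF, l + HALF):
            if j > 0:
                total = kinetic(n, l) + spin_orbit(n, l, j) + darwin(n, l)
                assert total == fine_structure(n, j)

ALPHA = 7.2973525693e-3   # fine-structure constant
MC2_EV = 510998.95        # electron rest energy in eV
gap = fine_structure(2, 3 * HALF) - fine_structure(2, HALF)
print(gap, float(gap) * ALPHA**4 * MC2_EV, "eV")   # 2p_{3/2} - 2p_{1/2} splitting
```

The loop confirms, in particular, that $2s_{1/2}$ and $2p_{1/2}$ receive the same total shift even though the individual contributions are different: the Darwin term for the s-wave does the job that spin-orbit coupling does for the p-wave.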


The Result from the Dirac Equation 


Our fine structure calculations have all treated relativistic effects perturbatively in
$v^2/c^2$. As we explained in Section 3.1.2, for the hydrogen atom this is equivalent to an
expansion in $(Z\alpha)^2$. In fact, for this problem there is an exact answer. The derivation
of this requires the Dirac equation and is beyond the scope of this course; instead we
simply state the answer. The energy levels of the relativistic hydrogen atom are given
by

$$E_{n,j} = mc^2\left[1 + \left(\frac{Z\alpha}{n - j - \frac{1}{2} + \sqrt{(j+\frac{1}{2})^2 - (Z\alpha)^2}}\right)^2\right]^{-1/2} \qquad (3.19)$$

Expanding in $Z\alpha$ gives

$$E_{n,j} = mc^2\left[1 - \frac{(Z\alpha)^2}{2n^2} + \frac{(Z\alpha)^4}{2n^4}\left(\frac{3}{4} - \frac{2n}{2j+1}\right) + \ldots\right]$$

The first term is, of course, the rest mass energy of the electron. The second term is the
usual hydrogen binding energy, while the final term is the fine structure corrections
that we've laboriously computed above.
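It is instructive to compare the exact formula with its expansion numerically. The sketch below evaluates the standard closed-form Dirac energy levels for a Coulomb potential alongside the rest energy plus Bohr energy plus fine structure. The constants are the standard values of $\alpha$ and $mc^2$, the function names are invented for this example, and ordinary floating point arithmetic is accurate enough for hydrogen.

```python
import math

ALPHA = 7.2973525693e-3    # fine-structure constant
MC2_EV = 510998.95         # electron rest energy in eV

def dirac_energy(n, j, Z=1):
    """Exact Dirac energy level in eV, including the rest energy."""
    za = Z * ALPHA
    denom = n - (j + 0.5) + math.sqrt((j + 0.5) ** 2 - za**2)
    return MC2_EV / math.sqrt(1.0 + (za / denom) ** 2)

def perturbative_energy(n, j, Z=1):
    """Rest energy + Bohr energy + fine structure, in eV."""
    za = Z * ALPHA
    bohr = -za**2 / (2 * n**2)
    fine = za**4 / (2 * n**4) * (0.75 - 2 * n / (2 * j + 1))
    return MC2_EV * (1.0 + bohr + fine)

split_exact = dirac_energy(2, 1.5) - dirac_energy(2, 0.5)
split_pert = perturbative_energy(2, 1.5) - perturbative_energy(2, 0.5)
print(split_exact, split_pert)   # both close to 4.5e-5 eV
```

The two splittings differ only at order $(Z\alpha)^6mc^2$, which for hydrogen is far below the size of the fine structure itself. Raising `Z` makes the difference between the exact and perturbative answers visible.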
The Lamb Shift


It turns out that the "exact" result (3.19) is not exact at all! In 1947, Willis Lamb
reported the experimental discovery of a splitting between the $2s_{1/2}$ and $2p_{1/2}$ states.
For this, he won the 1955 Nobel prize. The effect is now referred to as the Lamb shift.


The Lamb shift cannot be understood using the kind of single-particle quantum
mechanics that we're discussing in this course. It is caused by quantum fluctuations
of the electromagnetic field and needs the full machinery of quantum field theory,
specifically quantum electrodynamics, or QED for short. Historically the experimental
discovery of the Lamb shift was one of the prime motivations that led people to develop
the framework of quantum field theory.


3.1.6 Hyperfine Structure 


Both the fine structure corrections and the QED corrections treat the nucleus of the 
atom as a point-like object. This means that, although the corrections are complicated, 
the problem always has rotational symmetry. 


In reality, however, the nucleus has structure. This structure affects the atomic
energy levels, giving rise to what is called hyperfine structure. There are a number of
different effects that fall under this heading.


The most important effects come from the magnetic dipole moment of the nucleus.
Each constituent neutron and proton is a fermion, which means that they have an
internal intrinsic spin 1/2. The resulting nuclear spin is described by the quantum operator $\mathbf{I}$.
This, in turn, gives the nucleus a magnetic dipole moment

$$\mathbf{m}_N = g_N\frac{Ze}{2M}\,\mathbf{I}$$

This takes the same form as (3.9) for the electron magnetic moment. Here $M$ is the
mass of the nucleus while $g_N$ is the nucleus gyromagnetic factor.


The Dirac equation predicts that every fundamental fermion has $g = 2$ (plus some
small corrections). However, neither the proton nor the neutron is a fundamental
particle. At a cartoon level, we say that each is made of three smaller particles called
quarks. The reality is much more complicated! Each proton and neutron is made of
many hundreds of quarks and anti-quarks, constantly popping in and out of existence,
bound together by a swarm of further particles called gluons. It is, in short, a mess.
The cartoon picture of each proton and neutron containing three quarks arises because,
at any given time, each contains three more quarks than anti-quarks.

The fact that the protons and neutrons are not fundamental first reveals itself in
their anomalously large gyromagnetic factors. These are

$$g_{\rm proton} \approx 5.56 \quad\text{and}\quad g_{\rm neutron} \approx -3.83$$

The minus sign for the neutron means that a neutron spin precesses in the opposite
direction to a proton spin. Moreover, the spins point in opposite directions in their
ground state.


Now we can describe the ways in which the nuclear structure affects the energy levels
of the atom:

• Both the electron and the nucleus carry a magnetic moment. But we know from
our first course on Electromagnetism that there is an interaction between nearby
magnetic moments. This will lead to a coupling of the form $\mathbf{I}\cdot\mathbf{S}$ between the
nucleus and electron spins.

• The orbital motion of the electron also creates a further magnetic field, parallel
to $\mathbf{L}$. This subsequently interacts with the magnetic moment of the nucleus,
resulting in a coupling of the form $\mathbf{I}\cdot\mathbf{L}$.

• The nucleus may have an electric quadrupole moment. This means that the
electron no longer experiences a rotationally invariant potential.

For most purposes, the effects due to the nuclear magnetic moment are much larger 
than those due to its electric quadrupole moment. Here we restrict attention to s-wave 
states of the electron, so that we only have to worry about the first effect above. 


To proceed, we first need a result from classical electromagnetism. A magnetic
moment $\mathbf{m}_N$ placed at the origin will set up a magnetic field

$$\mathbf{B} = \frac{2\mu_0}{3}\,\mathbf{m}_N\,\delta^3(\mathbf{r}) + \frac{\mu_0}{4\pi r^3}\left(3(\mathbf{m}_N\cdot\hat{\mathbf{r}})\hat{\mathbf{r}} - \mathbf{m}_N\right) \qquad (3.20)$$

The second term is the long-distance magnetic field and was derived in Section 3 of
the Electromagnetism lectures. The first term is the magnetic field inside a current
loop, in the limit where the loop shrinks to zero size, keeping the dipole moment fixed.
(It actually follows from one of the problem sheet questions in the Electromagnetism
course.)

The electron spin interacts with this nuclear magnetic field through the hyperfine
Hamiltonian

$$\Delta H = -\mathbf{m}\cdot\mathbf{B} = \frac{e}{m}\,\mathbf{S}\cdot\mathbf{B}$$

For the s-wave, the contribution from the second term in (3.20) vanishes and we only
have to compute the first term. Writing the magnetic moments in terms of the spin,
the hyperfine Hamiltonian becomes

$$\langle\Delta H\rangle_{n,l=0} = \frac{\mu_0 g_N Ze^2}{3Mm}\,|\psi_{n,l=0}(0)|^2\ \mathbf{S}\cdot\mathbf{I}$$
$$\qquad\qquad = \frac{4g_N}{3}\frac{m}{M}(Z\alpha)^4mc^2\,\frac{1}{n^3}\,\frac{1}{\hbar^2}\,\mathbf{S}\cdot\mathbf{I} \qquad (3.21)$$

where, in the second line, we've used our previous expression (3.16) for the value of the
wavefunction at the origin, $|\psi_{n,l=0}(0)|^2 = Z^3/a_0^3\pi n^3$, together with $\mu_0 = 1/\epsilon_0c^2$ and the usual definitions

$$a_0 = \frac{\hbar}{\alpha mc} \quad\text{and}\quad \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c}$$


We see that the hyperfine splitting (3.21) has the same parametric form as the
fine structure, with the exception that it is further suppressed by the ratio of masses
$m/M$. For hydrogen with $Z = 1$, we should take $M = m_p$, the proton mass, and
$m/m_p \approx 1/1836$. So we expect this splitting to be three orders of magnitude smaller
than the fine structure splitting.


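We can evaluate the eigenvalues of the operator $\mathbf{S}\cdot\mathbf{I}$ in the same way as we dealt
with the spin orbit coupling in Section 3.1.3. We define the total spin as $\mathbf{F} = \mathbf{S} + \mathbf{I}$.
For hydrogen, where both the electron and proton have spin 1/2, we have

$$\frac{1}{\hbar^2}\,\mathbf{S}\cdot\mathbf{I} = \frac{1}{2\hbar^2}\left(\mathbf{F}^2 - \mathbf{S}^2 - \mathbf{I}^2\right) = \frac{1}{2}\left(F(F+1) - \frac{3}{2}\right) = \left\{\begin{array}{ll} -\frac{3}{4} & \ F = 0 \\ \ \ \frac{1}{4} & \ F = 1\end{array}\right. \qquad (3.22)$$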


This gives rise to the splitting between the spin up and spin down states of the electron. 
Or, equivalently, between the total spin F = 0 and F = 1 of the atom. 


The 21cm Line 


The most important application of hyperfine structure is the splitting of the $1s_{1/2}$
ground state of hydrogen. As we see from (3.22), the $F = 0$ spin singlet state has lower
energy than the $F = 1$ spin triplet. The energy difference is

$$\Delta E_{1s_{1/2}} = \frac{4g_N}{3}\frac{m}{M}\,\alpha^4mc^2 \approx 9.4\times 10^{-25}\ {\rm J}$$

This is small. But it's not that small. The temperature of the cosmic microwave
background is $T \approx 2.7$ K which corresponds to an energy of $E = k_BT \approx 3.7\times 10^{-23}$ J $> \Delta E_{1s_{1/2}}$.
This means that the hydrogen that is spread throughout space, even far from
stars and galaxies, will have its $F = 1$ states excited by the background thermal bath
of the universe.


When an electron drops from the $F = 1$ state to the $F = 0$ state, it emits a photon
with energy $\Delta E_{1s_{1/2}}$. This has frequency ∼ 1400 MHz and wavelength ∼ 21 cm. This
is important. The wavelength is much longer than the size of dust particles which float
around in space, blocking our view. This means that, in contrast to visible light, the
21cm emission line from hydrogen can pass unimpeded through dust. This makes it
invaluable in astronomy and cosmology.
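The numbers quoted here follow from (3.21) and (3.22) with $Z = n = 1$. The sketch below puts in standard values of the physical constants (these values, and the variable names, are inputs chosen for this example rather than anything fixed by the notes) and converts the splitting into a frequency and a wavelength.

```python
ALPHA = 7.2973525693e-3         # fine-structure constant
M_E_C2 = 8.1871057769e-14       # electron rest energy, J
MASS_RATIO = 1.0 / 1836.15267   # electron mass / proton mass
G_PROTON = 5.5857               # proton g-factor
H_PLANCK = 6.62607015e-34       # Planck constant, J s
C_LIGHT = 299792458.0           # speed of light, m / s
K_B = 1.380649e-23              # Boltzmann constant, J / K

def spin_dot(F, S=0.5, I=0.5):
    """Eigenvalue of S.I / hbar^2, as in (3.22)."""
    return 0.5 * (F * (F + 1) - S * (S + 1) - I * (I + 1))

# Prefactor of (3.21) with Z = n = 1, times the triplet-singlet difference.
prefactor = (4.0 * G_PROTON / 3.0) * MASS_RATIO * ALPHA**4 * M_E_C2
delta_e = prefactor * (spin_dot(1) - spin_dot(0))
frequency = delta_e / H_PLANCK
wavelength = C_LIGHT / frequency

print(f"splitting  = {delta_e:.3e} J")
print(f"frequency  = {frequency / 1e6:.0f} MHz")
print(f"wavelength = {100 * wavelength:.1f} cm")
print(f"k_B T_CMB / splitting = {K_B * 2.7 / delta_e:.0f}")
```

This leading-order estimate lands within a fraction of a percent of the measured line, with the small discrepancy coming from effects we have ignored, such as the reduced mass of the electron and the corrections to $g = 2$.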


For example, the hydrogen line allowed us to discover that our home, the Milky Way,
is a spiral galaxy. In this case, the velocity of the hydrogen gas in the spiral arms could
be detected by the red-shift of the 21cm line. Similarly, the 21cm line has allowed us
to map the distribution of hydrogen around other galaxies. It shows that hydrogen
sitting in the outskirts of the galaxies is rotating much too fast to be held in place by
the gravity from the visible matter alone. This is one of the key pieces of evidence
for dark matter. An example from the KAT7 telescope, a precursor to the Square
Kilometre Array, is shown in Figure 16. The green contours depict the hydrogen, as
measured by the 21cm line, stretching far beyond the visible galaxy.

Figure 16:

Looking forwards, there is optimism that the 21cm line will allow us to see the "dark
ages" of cosmology, the period of several hundred million years between when the
fireball of the Big Bang cooled and the first stars appeared.


Caesium 


Caesium has atomic number 55. Its nucleus has spin $I = 7/2$. The mixing with the
outer electron spin results in a hyperfine splitting of the ground state into two states,
one with $F = 3$ and the other with $F = 4$. The frequency of the transition between
these is now used as the definition of time. A second is defined as 9192631770 cycles
of the hyperfine transition frequency of caesium 133.


3.1.7 Atoms in an Expanding Universe


After getting our hands dirty understanding some subtleties of atomic spectra, let's
now waste our time doing something silly but fun.


The universe is expanding. We know this because galaxies get farther apart over time. 
But what does this expansion of space do to atoms? Is the electron in a hydrogen atom 
getting slowly, but inexorably, dragged away from the proton? The answer, as we shall 
see, is no. But there is some interesting, if entirely unobservable, physics involved. 


First we need a way to capture the expansion of the universe. Ultimately, this is an
effect that should be described using General Relativity. But it turns out that there
is a simple, Newtonian analog that can be used when the expansion is driven by a
cosmological constant $\Lambda$ which, happily, is the case in our current universe. In this
case, the potential for an electron orbiting a nucleus gets an extra term,

$$V(r) = -\frac{Z\alpha\hbar c}{r} - \frac{1}{6}\Lambda mr^2 \qquad (3.23)$$

The cosmological constant acts like an inverted harmonic oscillator. It means that,
for suitably large distances, particles get pushed apart from each other, which is the
expected effect of an expanding universe. Note that the additional term is proportional
to $m$, the mass of the electron. This is a reflection of the equivalence principle,
which says that gravitational forces are proportional to the mass of the particle. A
derivation of the Newtonian form of the cosmological constant (3.23) can be found in
the lectures on Cosmology.

Figure 17:


The form of this potential is shown in the figure although, as we will soon see, this
is not particularly to scale. Notably, there is a turning point. We'll be careless with
overall constants and just focus on orders of magnitude. The turning point then sits at

$$r_\star^3 \sim Z\alpha\,\frac{\hbar}{mc}\,\frac{c^2}{\Lambda}$$

Besides the dimensionless constant $Z\alpha$, there are two different length scales in this
expression. The first is the Compton wavelength of the electron,

$$\frac{\hbar}{mc} \approx 10^{-12}\ {\rm m}$$

The second is the length scale associated to the expansion of the universe

$$\frac{c}{\sqrt{\Lambda}} \approx 10^{26}\ {\rm m}$$

The turning point in the potential occurs at an appropriate mean of these two scales,
which turns out to be

$$r_\star \approx 10^{14}\ {\rm m}$$

This is about $r_\star \approx 0.01$ lightyears. It is rather large, at least as far as atoms are
concerned.


Without doing any further calculations, we can see that the effect of the expansion 
of the universe. Needless to say, for atoms that spread to any distance r < r,, the 
expansion of the universe doesn’t play any role. That’s deeply unsurprising. And, of 
course, holds for all actual atoms. But if we take the calculation above seriously, then 
electron orbits that extend to r ~ r, would be unstable to being ripped apart from by 
the expansion of spacetime! 


What does this mean for the hydrogen atom? The Bohr radius is $a_0 \approx 5 \times 10^{-11}$ m
and the wavefunction for the $n^{\rm th}$ excited state can be shown to be peaked around a
distance $\sim n^2 a_0$. All of which suggests that the first $n \sim 10^{12}$ excited states
still exist, but after that the electron’s life gets more perilous. (See, I told you that this
section would be slightly silly.)
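
The counting of surviving excited states is a one-line piece of arithmetic. Here is a minimal sketch, assuming only the rounded values quoted above (a turning point at roughly $10^{14}$ m and a Bohr radius of $5 \times 10^{-11}$ m); the variable names are purely illustrative.

```python
import math

r_star = 1e14   # turning point of the potential in metres (order of magnitude only)
a0 = 5e-11      # Bohr radius in metres

# The n-th state is peaked around n^2 * a0, so it reaches the turning point
# when n^2 * a0 ~ r_star.
n_max = math.sqrt(r_star / a0)
print(f"n_max is roughly 10^{math.log10(n_max):.0f}")
```

Given how careless we have been with constants, only the exponent should be taken seriously.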


However, there is another concern. An electron bound state in the potential (3.23) 
is always susceptible to tunnelling through the barrier. This would be a quantum 
tunnelling effect on cosmological scales and result in the instability of matter. Should 
we be worried? 


This is the kind of “tunnelling out of a trap” calculation that we did in Section
2.2.5. Following the steps that we took there, we can get an estimate for the lifetime
of hydrogen of the form

$$ T \sim T_0\, e^{S/\hbar} $$


Here $T_0$ is the appropriate atomic time scale. As we saw earlier in this section, the
electron in the ground state has average speed $\langle v \rangle = c\alpha$. It sits at a Bohr
radius $a_0 = \hbar/mc\alpha$, from which we can extract a time scale

$$ T_0 = \frac{\hbar}{mc^2\alpha^2} \approx 2 \times 10^{-17}\ {\rm s} $$

Roughly speaking, this is the time taken for the electron to make a single orbit (ignoring
factors of $2\pi$). That leaves us with the exponential factor that comes from tunnelling.
Recall that the all-important factor of S is the action

$$ S = \int_{x_0}^{x_1} dx'\, \sqrt{2m\left(V(x') - E\right)} $$


The potential is given in (3.23). Here we should take $E = -\frac{1}{2}mc^2\alpha^2$, the ground state
energy of hydrogen. The limits of the integral are taken between $x_0 \sim \hbar/mc\alpha$ and
$x_1 \approx \sqrt{3\alpha^2/\Lambda}$, which is where the integrand vanishes (and I’m being sloppy about
various factors at this stage). This integral is entirely dominated by the upper limit
and, again ignoring various factors, is given by

$$ \frac{S}{\hbar} \sim \frac{mc}{\hbar}\, \frac{\alpha^2}{\sqrt{\Lambda}} $$

This is the ratio of a cosmological length scale to an atomic one. It’s going to be large.
Indeed, you can check that $S/\hbar \sim 10^{34}$. We learn that the expected lifetime of a
hydrogen atom, before it is unceremoniously torn apart by the expansion of the universe,
is roughly

$$ T \sim T_0\, e^{10^{34}} $$

This isn’t something that should keep you awake at night. Indeed, numbers like $e^{10^{34}}$
are so ridiculously large that it doesn’t matter what units you measure them in: it’s
more or less the same timescale whether you measure it in Planck units, seconds, or
Hubble times.
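
To see why the choice of units is irrelevant, it helps to work with the logarithm of the lifetime. The sketch below assumes a tunnelling exponent of order $10^{34}$, which is only an order-of-magnitude estimate, together with rounded values for the Planck time and the Hubble time; all names are illustrative.

```python
import math

S_over_hbar = 1e34   # assumed tunnelling exponent S/hbar (order of magnitude only)
T0 = 2e-17           # atomic time scale in seconds

# log10 of the lifetime measured in seconds: log10(T0) + (S/hbar) * log10(e)
log10_T_seconds = math.log10(T0) + S_over_hbar * math.log10(math.e)

# Rounded values of some very different units of time, in seconds
units_in_seconds = {"Planck time": 5.4e-44, "second": 1.0, "Hubble time": 4.5e17}

# Changing units shifts log10(T) by at most the spread in log10 of the units
spread = math.log10(max(units_in_seconds.values()) / min(units_in_seconds.values()))
relative_spread = spread / log10_T_seconds

print(f"log10(T / second) is about {log10_T_seconds:.2e}")
print(f"switching units shifts this by at most {spread:.0f}, "
      f"a relative change of {relative_spread:.0e}")
```

The shift from changing units is about sixty orders of magnitude, which is utterly negligible next to an exponent of order $10^{33}$.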


3.2 Atomic Structure 


In this section, we finally move away from hydrogen and discuss atoms further up the
periodic table. The Hamiltonian for N electrons orbiting a nucleus with atomic number
Z is

$$ H = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r_i} \right) + \sum_{i<j} \frac{e^2}{4\pi\epsilon_0}\frac{1}{|{\bf r}_i - {\bf r}_j|} \tag{3.24} $$

For a neutral atom, we take N = Z. However, in what follows it will be useful to keep
N and Z independent. For example, this will allow us to describe ions.

We can, of course, add to this Hamiltonian relativistic fine structure and hyperfine
structure interactions of the kind we described in the previous section. We won’t do this.
As we will see, the Hamiltonian (3.24) will contain more than enough to keep us busy.
Our goal is to find its energy eigenstates. Further, because electrons are fermions, we
should restrict ourselves to wavefunctions which are anti-symmetric under the exchange
of any two electrons.


It is a simple matter to write down the Schrödinger equation describing a general
atom. It is another thing to solve it! No exact solutions of (3.24) are known for
$N \geq 2$. Instead, we will look at a number of different approximation schemes to try
to understand some aspects of atomic structure. We start in this section by making
the drastic assumption that the electrons don’t exert a force on each other. This is
not particularly realistic, but it means that we can neglect the final interaction term
in (3.24). In this case, the Hamiltonian reduces to N copies of

$$ H_0 = -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r} $$

This, of course, is the Hamiltonian for the hydrogen atom, albeit with the proton charge
+e replaced by Ze. And, as reviewed in Section 3.1.1, we know everything about the
solutions with this Hamiltonian.


3.2.1 A Closer Look at the Periodic Table 


Ignoring the interaction between electrons gives us an eminently solvable problem. The
only novelty comes from the Pauli exclusion principle which insists that no two electrons
can sit in the same state. The ground state of a multi-electron atom consists of filling
the first Z available single-particle states of the hydrogen atom.


However, as we’ve seen above, there is a large degeneracy of energy levels in the 
hydrogen atom. This means that, for general Z, the rule above does not specify a 
unique ground state for the atom. Nonetheless, when Z hits certain magic numbers, 
there will be a unique ground state. This occurs when there are exactly the right 
number of electrons to fill energy levels. Those magic numbers are: 


n | l          | Degeneracy             | N
1 | 0          | 2                      | 2
2 | 0, 1       | 2 × (1+3) = 8          | 2 + 8 = 10
3 | 0, 1, 2    | 2 × (1+3+5) = 18       | 2 + 8 + 18 = 28
4 | 0, 1, 2, 3 | 2 × (1+3+5+7) = 32     | 2 + 8 + 18 + 32 = 60



This simple minded approach suggests that at the magic numbers Z = 2,10, 28,60,... 
the atoms will have a full shell of electrons. If we were to add one more electron it 
would have to sit in a higher energy level, so would be less tightly bound. We might, 
then, want to predict from our simple minded non-interacting model that atoms with 
these special values of Z will be the most chemically stable. 
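
The counting in the table above is easy to reproduce. The following is a small sketch (the function names are our own) that fills hydrogen-like shells in order of n, with 2(2l + 1) states for each allowed l.

```python
def shell_degeneracy(n):
    """Number of single-electron states with principal quantum number n, spin included."""
    return 2 * sum(2 * l + 1 for l in range(n))   # this equals 2 * n**2

def non_interacting_magic_numbers(n_max):
    """Electron counts at which the shells n = 1, ..., n_max are completely full."""
    totals, running = [], 0
    for n in range(1, n_max + 1):
        running += shell_degeneracy(n)
        totals.append(running)
    return totals

magic = non_interacting_magic_numbers(4)
print(magic)   # [2, 10, 28, 60]
```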


A look at the periodic table shows that our prediction is not very impressive! We
learn in school that the most chemically stable elements are the inert Noble gases
on the far right. We can quantify this by looking at the ionization energies of atoms
as a function of Z, as shown in Figure 18, which shows that the most stable elements
have Z = 2, 10, 18, 36, 54, 86 and 118.

Figure 18: [ionization energy as a function of Z]

We see that our non-interacting model gets the first two numbers right, but after
that it all goes pear shaped. In particular, we predicted that Z = 28 would be special
but this corresponds to nickel which sits slap in the middle of the transition metals!
Meanwhile, we missed argon, a stable Noble gas with Z = 18. Of course, there’s no
secret about what we did wrong. Our task is to find a way to include the interactions
between electrons to explain why the Noble gases are stable.


Before we return to the Schrödinger equation, we will build some intuition by looking
more closely at the arrangement of electrons that arises in the periodic table. First some
notation. We describe the configuration of electrons by listing the hydrogen orbitals
that are filled, using the notation $n\#^p$ where # is the letter (s, p, d, f, etc.) denoting
the l quantum number and p is the number of electrons in these states.

The electrons which have the same value of n are said to sit in the same shell.
Electrons that have the same value of n and l are said to sit in the same sub-shell.
Each sub-shell contains 2(2l + 1) different states. Electrons which sit in fully filled shells
(or sometimes sub-shells) are said to be part of the core electrons. Those which sit in
partially filled shells are said to form the valence electrons. The valence electrons lie
farthest from the nucleus of the atom and are primarily responsible for its chemical
properties.

There are only two elements with electrons lying in the n = 1 shell. These are
hydrogen and helium

Z         | 1      | 2
Element   | H      | He
Electrons | $1s^1$ | $1s^2$


Next, the elements with electrons in the first two shells. These are

Z     | 3      | 4      | 5          | 6          | 7          | 8          | 9          | 10
      | Li     | Be     | B          | C          | N          | O          | F          | Ne
[He]+ | $2s^1$ | $2s^2$ | $2s^22p^1$ | $2s^22p^2$ | $2s^22p^3$ | $2s^22p^4$ | $2s^22p^5$ | $2s^22p^6$

where the notation in the bottom line means that each element has the filled n = 1
shell of helium, together with the extra electrons listed. We see that the atoms seem
to be following a reasonable pattern but, already here, there is a question to answer
that does not follow from our non-interacting picture: why do the electrons prefer to
first fill up the 2s states, followed by the 2p states?


The next set of atoms in the periodic table have electrons in the third shell. They
are

Z     | 11     | 12     | 13         | 14         | 15         | 16         | 17         | 18
      | Na     | Mg     | Al         | Si         | P          | S          | Cl         | Ar
[Ne]+ | $3s^1$ | $3s^2$ | $3s^23p^1$ | $3s^23p^2$ | $3s^23p^3$ | $3s^23p^4$ | $3s^23p^5$ | $3s^23p^6$

where now the electrons fill the $2s^22p^6$ states of neon, together with those listed on the
bottom line. Again, we see that the 3s level fills up before the 3p, something which
we will later need to explain. But now we see that it’s sufficient to fill the 3p states to
give a chemically inert element. This suggests that there is a big energy gap
between 3p and 3d, again something that is not true in the absence of interactions.


In the next row of elements, we see another surprise. We have

Z     | 19     | 20     | 21         | 22         | ... | 30            | 31                | ... | 36
      | K      | Ca     | Sc         | Ti         | ... | Zn            | Ga                | ... | Kr
[Ar]+ | $4s^1$ | $4s^2$ | $3d^14s^2$ | $3d^24s^2$ | ... | $3d^{10}4s^2$ | $3d^{10}4s^24p^1$ | ... | $3d^{10}4s^24p^6$

We see that we fill the 4s states before the 3d states. This is now in direct contradiction
to the non-interacting model, which says that 4s states should have greater energy than
3d states.

There is a simple rule that chemists employ to explain the observed structure. It is
called the aufbau principle and was first suggested by Bohr. It says that you should write
all possible $n\#$ energy levels in a table, as shown in Figure 19. The order in which the
energy levels are filled is set by the arrows: first 1s, followed by 2s, 2p, 3s, and then
3p, 4s, 3d, 4p and so on. This explains the observed filling above. Our task in these
lectures is to explain where the aufbau principle comes from, together with a number of
further rules that chemists invoke to explain the elements.

Figure 19: Aufbau. [table of the levels 1s, 2s, 2p, 3s, 3p, 3d, 4s, ... crossed by diagonal arrows that set the filling order]
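
The diagonal arrows of the aufbau table can be turned into a sorting rule: order the sub-shells by the value of n + l and, where this ties, by n. This rule is commonly called the Madelung rule, a name that is not used in these notes. The sketch below generates the filling order this way and records the electron count each time 1s or a p sub-shell is completed; the function and variable names are our own.

```python
LETTERS = "spdfghi"

def aufbau_order(n_max):
    """Sub-shells (n, l) with n <= n_max, sorted by n + l and then by n."""
    subshells = [(n, l) for n in range(1, n_max + 1) for l in range(n)]
    return sorted(subshells, key=lambda nl: (nl[0] + nl[1], nl[0]))

order = aufbau_order(7)
labels = [f"{n}{LETTERS[l]}" for n, l in order]
print(" ".join(labels[:12]))

# Running electron count whenever 1s or a p sub-shell has just been filled
closed, total = [], 0
for n, l in order:
    total += 2 * (2 * l + 1)
    if l == 1 or (n, l) == (1, 0):
        closed.append(total)
print(closed)   # [2, 10, 18, 36, 54, 86, 118]
```

The closed counts coincide with the atomic numbers of the Noble gases quoted earlier, which is something the non-interacting shell counting fails to do beyond Z = 10.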


The aufbau principle also explains why the periodic table needs those two extra lines,
drifting afloat at the bottom: after we fill 6s (Cs and Ba) we move to 4f which has 14
states. These are elements Z = 58 to Z = 71. However, rather annoyingly, the first
element in those bottom two lines is La with Z = 57 and this, it turns out, is an
exception to the aufbau principle, with electron configuration [Xe]$5d^16s^2$!


In fact, the “aufbau principle” is more an “aufbau rule of thumb”. As we go to higher
values of Z there are an increasing number of anomalies. Some of these are hidden
in the ... in the last table above. Vanadium with Z = 23 has electron configuration
[Ar]$3d^34s^2$, but it is followed by chromium with Z = 24 which has [Ar]$3d^54s^1$. We see
that the 4s state became depopulated, with an extra electron sitting in 3d. By the
time we get to manganese at Z = 25, we’re back to [Ar]$3d^54s^2$, but the anomaly occurs
again for copper with Z = 29 which has [Ar]$3d^{10}4s^1$. Chemistry, it turns out, is a little
bit messy. Who knew?


Even scandium, with Z = 21, hides a failure of the aufbau principle. At first glance,
it would seem to be a poster child for aufbau, with its configuration [Ar]$3d^14s^2$. But if
we strip off an electron to get the ion Sc$^+$, we have [Ar]$3d^14s^1$. Stripping off a further
electron, Sc$^{2+}$ has [Ar]$3d^1$. Neither of these follows aufbau. These anomalies only get
worse as we get to higher Z. There are about 20 neutral atoms which have anomalous
fillings and many more ions.


We will not be able to explain all these anomalies here. Indeed, even to derive the 
aufbau principle we will have to resort to numerical results at some stage. We will, 
however, see that multi-electron atoms are complicated! In fact, it is rather surprising 
that they can be accurately described using 1-particle states at all. At the very least you 
should be convinced that there need not be a simple rule that governs all of chemistry.

3.2.2 Helium and the Exchange Energy

We're going to start by looking at the simplest example of a multi-electron atom: 
helium. This will start to give some physical intuition for the aufbau principle. It will 
also help reveal the role that the spin of the electron plays in the energy of states. 


The Ground State of Helium 


We've already discussed the ground state of helium in Section 2.1.2 as an example of
the variational method. Let’s first recap the main results of that analysis.

In the ground state, both electrons sit in the 1s state, so that their spatial wavefunction
takes the form

$$ \Psi({\bf r}_1, {\bf r}_2) = \psi_{1,0,0}({\bf r}_1)\,\psi_{1,0,0}({\bf r}_2) \quad {\rm with} \quad \psi_{1,0,0}({\bf r}) = \sqrt{\frac{Z^3}{\pi a_0^3}}\, e^{-Zr/a_0} \tag{3.25} $$

Here $a_0 = 4\pi\epsilon_0\hbar^2/me^2$ is the Bohr radius. For helium, we should pick Z = 2.


Since the spatial wavefunction is symmetric under exchange of the particles, we
rely on the spin degrees of freedom to provide the necessary anti-symmetry of the full
wavefunction. The spins must therefore sit in the singlet state

$$ |0,0\rangle = \frac{|\!\uparrow\rangle|\!\downarrow\rangle - |\!\downarrow\rangle|\!\uparrow\rangle}{\sqrt{2}} \tag{3.26} $$


Computing the shift of energy is a simple application of first order perturbation theory.
The interaction Hamiltonian is

$$ H_{\rm int} = \frac{e^2}{4\pi\epsilon_0}\frac{1}{|{\bf r}_1 - {\bf r}_2|} \tag{3.27} $$

and, correspondingly, the shift of the ground state energy is given by

$$ \Delta E = \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{|\psi_{1,0,0}({\bf r}_1)|^2\, |\psi_{1,0,0}({\bf r}_2)|^2}{|{\bf r}_1 - {\bf r}_2|} $$

We showed how to compute this integral in Section 2.1.2 and found $\Delta E = \frac{5}{4} Z\, {\rm Ry}$.
This then gives a total ground state energy of $E_0 \approx -74.8$ eV which, given the lack of
control of perturbation theory, is surprisingly close to the true value $E_0 \approx -79$ eV.

We also learned in Section 2.1.2 that we can do better using a variational ansatz.
Although we will not employ this technique below, there is a physics lesson that it’s
useful to highlight. In the variational method, we again work with the form of the
wavefunction (3.25), but this time allow the atomic number Z of the nucleus to be
our variational parameter. We found that we can achieve a lower ground state energy,
$E_0 \approx -77.5$ eV, one which is closer to the true value, if instead of setting Z = 2
in the wavefunction, we take

$$ Z = 2 - \frac{5}{16} $$

There is some physical intuition behind this result. Each electron sees the charge Z = 2
of the nucleus reduced somewhat by the presence of the other electron. This is called
screening and it is the basic phenomenon which, ultimately, underlies much of the
physics of atomic structure.
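
Both numbers quoted above follow from a single quadratic function. As a sketch, take two electrons in hydrogenic 1s orbitals with a trial charge z. In units of the Rydberg, the kinetic energy is $2z^2$, the attraction to the nucleus is $-4Zz$ and the electron repulsion is $\frac{5}{4}z$. The function name and the value used for the Rydberg are our own choices.

```python
RYDBERG_EV = 13.6057   # Rydberg energy in eV

def helium_like_energy(z_trial, z_nucleus=2.0):
    """Energy in eV of two electrons in hydrogenic 1s orbitals with charge z_trial."""
    kinetic = 2 * z_trial**2
    attraction = -4 * z_nucleus * z_trial
    repulsion = 1.25 * z_trial
    return (kinetic + attraction + repulsion) * RYDBERG_EV

first_order = helium_like_energy(2.0)     # z_trial = Z: first order perturbation theory
z_best = 2.0 - 5.0 / 16.0                 # minimum of the quadratic in z_trial
variational = helium_like_energy(z_best)

print(f"first order: {first_order:.1f} eV")
print(f"variational: {variational:.1f} eV at z = {z_best}")
```

Setting the derivative $4z - 4Z + \frac{5}{4}$ to zero gives $z = Z - 5/16$, which is the screened charge described above.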


Excited States of Helium 


Let’s now extend our discussion to the first excited state of helium. From our
non-interacting model, there are two possibilities which, as far as non-interacting electrons
are concerned, are degenerate. These are $1s^12s^1$ and $1s^12p^1$. We would like to understand
which of these has lowest energy.


In fact, there is a further splitting of each of these states due to the spin-degrees of 
freedom. To understand this splitting, we need to recall the following: 


• The Hamiltonian is blind to the spin degrees of freedom. This means that the
wavefunction takes the form of a tensor product of a spatial state with a spin
state.

• Electrons are fermions. This means that the overall wavefunction must be
anti-symmetric under exchange of the two particles.


There are two ways to achieve the anti-symmetry: we either make the spatial wavefunction
symmetric and the spin wavefunction anti-symmetric, or vice versa. The two
possibilities for the spatial wavefunction are

$$ \Psi_{ab,\pm}({\bf r}_1, {\bf r}_2) = \frac{1}{\sqrt{2}}\Big( \psi_a({\bf r}_1)\psi_b({\bf r}_2) \pm \psi_a({\bf r}_2)\psi_b({\bf r}_1) \Big) $$

where we’re using the notation a, b to denote the triplet of quantum numbers (n, l, m).
For the first excited states, we should take a = (1, 0, 0). Then b = (2, 0, 0) for the $1s^12s^1$
state and b = (2, 1, m) for the triplet of $1s^12p^1$ states, with m = −1, 0, 1.

The symmetric wavefunctions $\Psi_{ab,+}$ must be combined with the anti-symmetric
spin-singlet (3.26), which we write as

$$ |ab; s=0\rangle = \Psi_{ab,+}({\bf r}_1,{\bf r}_2) \otimes |0,0\rangle \tag{3.28} $$

where $|0,0\rangle$ is the spin singlet state defined in (3.26). Note that we shouldn’t confuse
the s = 0 spin with the label “s” used to denote the l = 0 atomic orbital. They are
different! Also, I’ve been a bit lax about my notation for wavefunctions: the expression
above should really read $|ab; s=0\rangle = |\Psi_{ab,+}\rangle \otimes |0,0\rangle$ where the fermionic two-particle
state $|\Psi_{ab,+}\rangle$ has overlap $\Psi_{ab,+}({\bf r}_1,{\bf r}_2) = \langle {\bf r}_1,{\bf r}_2|\Psi_{ab,+}\rangle$ with the position basis of
two-particle states $|{\bf r}_1,{\bf r}_2\rangle$. This, more precise, notation turns out to be somewhat more
cumbersome for our needs.


Similarly, the anti-symmetric wavefunction must be paired with the symmetric spin
states. There is a triplet of such states, $|s=1; m_s\rangle$,

$$ |1,1\rangle = |\!\uparrow\rangle|\!\uparrow\rangle \ ,\quad |1,0\rangle = \frac{|\!\uparrow\rangle|\!\downarrow\rangle + |\!\downarrow\rangle|\!\uparrow\rangle}{\sqrt{2}} \ ,\quad |1,-1\rangle = |\!\downarrow\rangle|\!\downarrow\rangle \tag{3.29} $$

The total wavefunctions are again anti-symmetric,

$$ |ab; s=1\rangle = \Psi_{ab,-}({\bf r}_1,{\bf r}_2) \otimes |1, m_s\rangle \qquad m_s = -1, 0, 1 \tag{3.30} $$

For both $\Psi_{ab,+}$ and $\Psi_{ab,-}$ we take a to be the 1s state and b to be either the 2s or 2p state.
The upshot of this analysis is that there are 4 possible $1s^12s^1$ states: a spin-singlet and
a spin-triplet. There are 12 possible $1s^12p^1$ states: 3 spin-singlets and 9 spin-triplets,
the extra factor of 3 coming from the orbital angular momentum m = −1, 0, 1. Notice
how fast the number of states grows, even for the simplest multi-electron atom! For
the first excited state, we already have 16 options. This fast growth in the dimension
of the Hilbert space is one of the characteristics of quantum mechanics.
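
The counting of states can be checked by brute force. The sketch below simply enumerates the allowed values of m for the excited orbital and of $m_s$ for each spin multiplet; the dictionary names are our own.

```python
from itertools import product

orbital_m_values = {"2s": [0], "2p": [-1, 0, 1]}            # m for the excited orbital b
spin_ms_values = {"singlet": [0], "triplet": [-1, 0, 1]}    # m_s for each spin multiplet

counts = {}
for orbital, m_values in orbital_m_values.items():
    for spin, ms_values in spin_ms_values.items():
        counts[(orbital, spin)] = len(list(product(m_values, ms_values)))

total = sum(counts.values())
print(counts)
print("total number of states:", total)   # 16
```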


Fortunately, we don’t have to do degenerate perturbation theory with 16 × 16 dimensional
matrices! The matrix elements of the interaction Hamiltonian (3.27) are
already diagonal in the basis $|ab; s\rangle$ that we’ve described above. This follows
on symmetry grounds. The interaction Hamiltonian preserves rotational invariance, so
the total orbital angular momentum must remain a good quantum number. Further,
it doesn’t mix spin states and $\langle 0,0|1,m\rangle = 0$. This means that the states (3.28) and
(3.30) are guaranteed to be energy eigenstates, at least to first order in perturbation
theory.

In summary, we are looking for four energy levels, corresponding to the states
$|1s^12s^1; s\rangle$ and $|1s^12p^1; s\rangle$ where s = 0 or 1. The question we would like to ask is:
what is the ordering of these states?

We can make some progress with this question without doing any calculations. The
interaction Hamiltonian (3.27) is a repulsive potential between the electrons. Clearly
the states with lowest energy will be those where the electrons try to stay apart from
each other. But the anti-symmetric wavefunction $\Psi_{ab,-}$ has the property that it vanishes
when ${\bf r}_1 = {\bf r}_2$ and the electrons sit on top of each other. This strongly suggests that
$\Psi_{ab,-}$ will have lower energy than $\Psi_{ab,+}$ and, correspondingly, the spin-triplet versions
of a state will have lower energy than the spin-singlets.


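We can see this mathematically. The energy splitting is

$$ \Delta E_{ab\pm} = \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{|\Psi_{ab,\pm}({\bf r}_1,{\bf r}_2)|^2}{|{\bf r}_1 - {\bf r}_2|} = J_{ab} \pm K_{ab} $$

where $J_{ab}$ and $K_{ab}$ are given by

$$ \begin{aligned} J_{ab} &= \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{1}{2}\, \frac{|\psi_a({\bf r}_1)\psi_b({\bf r}_2)|^2 + |\psi_a({\bf r}_2)\psi_b({\bf r}_1)|^2}{|{\bf r}_1 - {\bf r}_2|} \\ &= \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{|\psi_a({\bf r}_1)\psi_b({\bf r}_2)|^2}{|{\bf r}_1 - {\bf r}_2|} \end{aligned} \tag{3.31} $$

where the second line follows because the integrand is symmetric under exchange
${\bf r}_1 \leftrightarrow {\bf r}_2$. Meanwhile, we have

$$ \begin{aligned} K_{ab} &= \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{1}{2}\, \frac{\psi_a^\star({\bf r}_1)\psi_b^\star({\bf r}_2)\psi_a({\bf r}_2)\psi_b({\bf r}_1) + \psi_a^\star({\bf r}_2)\psi_b^\star({\bf r}_1)\psi_a({\bf r}_1)\psi_b({\bf r}_2)}{|{\bf r}_1 - {\bf r}_2|} \\ &= \frac{e^2}{4\pi\epsilon_0} \int d^3r_1\, d^3r_2\ \frac{\psi_a^\star({\bf r}_1)\psi_b^\star({\bf r}_2)\psi_a({\bf r}_2)\psi_b({\bf r}_1)}{|{\bf r}_1 - {\bf r}_2|} \end{aligned} \tag{3.32} $$

The contribution $J_{ab}$ is called the direct integral, $K_{ab}$ is called the exchange integral or,
sometimes, the exchange energy. Note that it involves an integral over the position of
the particle ${\bf r}_1$, weighted with both possible states $\psi_a({\bf r}_1)$ and $\psi_b({\bf r}_1)$ that the electron
can sit in.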


Both $J_{ab}$ and $K_{ab}$ are positive definite. This is not obvious for $K_{ab}$, but is intuitively
true because the integral is dominated by the region ${\bf r}_1 \approx {\bf r}_2$ where the numerator is
approximately $|\psi_a({\bf r})|^2|\psi_b({\bf r})|^2$. Since the shift in energy is $\Delta E_{ab\pm} = J_{ab} \pm K_{ab}$ we see
that, as expected, the spin-triplet states with spatial anti-symmetry have lower energy.
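
For the 1s and 2s orbitals the direct and exchange integrals reduce to two-dimensional radial integrals, because the angular average of $1/|{\bf r}_1 - {\bf r}_2|$ for s-states is $1/\max(r_1, r_2)$. The following is a rough numerical sketch, assuming unperturbed hydrogenic orbitals with Z = 2 and working in atomic units (Hartree and Bohr radii). The grid parameters and names are our own choices. For comparison, the analytic values for these orbitals are J = 17Z/81 and K = 16Z/729 in Hartree.

```python
import numpy as np

Z = 2.0                                   # helium nucleus, atomic units throughout
r = np.linspace(1e-4, 25.0, 2500)         # radial grid in Bohr radii
dr = r[1] - r[0]

# Normalised hydrogenic radial wavefunctions
R_1s = 2.0 * Z**1.5 * np.exp(-Z * r)
R_2s = (Z / 2.0)**1.5 * (2.0 - Z * r) * np.exp(-Z * r / 2.0)

# For s-orbitals the angular average of 1/|r1 - r2| is 1/max(r1, r2)
inv_r_greater = 1.0 / np.maximum.outer(r, r)

def coulomb_integral(f, g):
    """Riemann-sum estimate of the double integral of f(r1) g(r2) / max(r1, r2)."""
    return (f @ inv_r_greater @ g) * dr**2

overlap_density = R_1s * R_2s * r**2
J = coulomb_integral((R_1s * r)**2, (R_2s * r)**2)      # direct integral
K = coulomb_integral(overlap_density, overlap_density)  # exchange integral

print(f"J = {J:.4f} Hartree, K = {K:.4f} Hartree")
print(f"singlet shift J + K = {J + K:.4f}, triplet shift J - K = {J - K:.4f}")
```

The exchange integral comes out positive but roughly ten times smaller than the direct integral, so the triplet sits below the singlet by 2K.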


We’ve learned that each of the spin-triplet states is lower than its spin-singlet
counterpart. But what of the ordering of $1s^12s^1$ vs $1s^12p^1$? For this, we have to do the
integrals J and K. One finds that the pair of 2s energy levels have lower energy than
the pair of 2p energy levels. This, of course, is the beginning of the aufbau principle:
the 2s levels fill up before the 2p levels. The resulting energy levels are shown in the
diagram.

Taken literally, our calculation suggests that the 2s state has lower energy because
it does a better job at avoiding the original 1s electron. This is misleading:
it’s more an artefact of our (not particularly good) perturbative approach to the
problem, rather than a good description of the underlying physics. One
could do a better job by introducing variational wavefunctions, similar to those we
looked at for the ground state. This approach would highlight the reason why states
of higher l have higher energy. This reason is screening.

Figure 20: [energy levels of the 1s2s and 1s2p states of helium, split from the unperturbed level]

As we’ve seen, excited states of helium sit in both spin-singlets and spin-triplets.
Parity means that transitions between these two states can only occur through the
emission of two photons which makes these transitions much rarer. The lifetime of the
1s2s state turns out to be around 2.2 hours. This is very long on atomic timescales;
indeed, it is the longest lived of all excited states of neutral atoms. It is said to be
meta-stable. Before these transitions were observed, it was thought that there were two
different kinds of helium atoms: those corresponding to spin-singlet states and those
corresponding to spin-triplets. Historically the spin-singlet states were referred to as
parahelium, the spin-triplet states as orthohelium.


The punchline from the story above is that spatially anti-symmetric wavefunctions
are preferred since these come with a negative exchange energy. The fermionic nature
of electrons means that these wavefunctions sit in spin-triplet states. This fact plays
an important role in many contexts beyond atomic physics. For example, the spins in
solids often have a tendency to align, a phenomenon known as ferromagnetism. This
too can be traced to the exchange integral for the Coulomb repulsion between atoms
preferring the spins to sit in a triplet state. This results in the kind of ${\bf S}_1 \cdot {\bf S}_2$
spin-spin interaction that we met in the Statistical Physics course when discussing the Ising
model.


3.2.3 An Instability of (Very) Large Nuclei 

The periodic table doesn’t go on for ever. The heaviest, stable element is Bismuth-209
with Z = 83. There are heavier elements with long lifetimes, such as Uranium-238
with Z = 92 which has a half-life of around 4.5 billion years. But as you continue to go
up in atomic number, the half-lives become much shorter. The heaviest elements with
Z = 117 and Z = 118 have to be created artificially and have a half-life measured in
milliseconds.


This instability arises because the repulsive Coulomb force between protons defeats 
the attractive, but short-ranged, interaction of the strong nuclear force. The full details 
are complicated and clearly need an understanding of the strong nuclear force. 


However, there is another instability of heavy, charged nuclei that involves only
electromagnetism and is very easy to see. This follows simply from the binding energy
(3.3) of an electron with the nucleus,

$$ E_1 = -\frac{Z^2\alpha^2 mc^2}{2} \quad {\rm where} \quad \alpha = \frac{e^2}{4\pi\epsilon_0\hbar c} \approx \frac{1}{137} $$

If this binding energy is sufficiently large, then it’s energetically preferable to pair
produce an electron-positron pair out of the vacuum. Of course, this costs a significant
amount of energy: it’s $E_{\rm pair} = 2mc^2$, where the factor of two is there because both
the electron and positron must be created. But the electron can then be captured by
the nucleus, saving $|E_1|$ of energy. (Admittedly, we are assuming that the nucleus has
been stripped of orbiting electrons here so the lowest slot is not already taken.) The
end result would be that the nucleus spits out a positron, collecting a tightly-bound
electron. This whole process is energetically preferable if

$$ E_1 + E_{\rm pair} < 0 \quad \Rightarrow \quad Z > \frac{2}{\alpha} $$

The factor of 2 means that this particular instability only kicks in when Z ∼ 280 which
means that it’s not the mechanism that destabilises the heavy elements in the periodic
table.
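
The threshold is quick to evaluate. Here is a small sketch using the non-relativistic binding energy $-\frac{1}{2}(Z\alpha)^2 mc^2$; the constants are standard rounded values and the names are our own.

```python
ALPHA = 1 / 137.036    # fine structure constant
MC2_EV = 510_999.0     # electron rest energy in eV

def ground_state_energy(Z):
    """Non-relativistic ground state energy -(Z alpha)^2 m c^2 / 2 of one electron, in eV."""
    return -0.5 * (Z * ALPHA)**2 * MC2_EV

pair_cost = 2 * MC2_EV     # energy needed to create the electron-positron pair
Z_critical = 2 / ALPHA     # solves E_1 + 2 m c^2 = 0

print(f"hydrogen check: E_1 = {ground_state_energy(1):.1f} eV")
print(f"critical Z is roughly {Z_critical:.0f}")
```

This lands in the high 270s, consistent with the rough figure quoted above, and far beyond anything in the periodic table.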


3.3 Self-Consistent Field Method 


As we’ve seen from our attempts to understand helium, a naive application of pertur- 
bation theory is not particularly effective. Not only does it become complicated as the 
number of possible states grows, but it also fails to capture the key physics of screening. 


In this section, we will develop a variational approach to multi-electron atoms where, 
as we will see, the concept of screening sits centre stage. The idea is to attempt to reduce 
our multi-particle problem to a single-particle problem. But we don’t do this merely 
by ignoring the effects of the other particles; instead we will alter our Hamiltonian in 
a way that takes these other particles into account. This method is rather similar to 
the mean field theory approach that we met in Statistical Physics; in both cases, one 
averages over many particles to find an effective theory for a single particle. 



3.3.1 The Hartree Method 


We start by considering a variational ansatz for the multi-particle wavefunction. For 
now, we will forget that the electrons are fermions. This means that we won’t im- 
pose the requirement that the wavefunction is anti-symmetric under the exchange of 
particles, nor will we include the spin degrees of freedom. Obviously, this is missing 
something important but it will allow us to highlight the underlying physics. We will 
fix this oversight in Section 3.3.3 when we discuss the Hartree-Fock method. 


We pretend that the electrons are independent and take as our ansatz the product
wavefunction

$$ \Psi({\bf r}_1, \ldots, {\bf r}_N) = \psi_{a_1}({\bf r}_1)\, \psi_{a_2}({\bf r}_2) \ldots \psi_{a_N}({\bf r}_N) \tag{3.33} $$

Here the labels $a_i$ denote various quantum numbers of the one-particle states. We
will ultimately see that the states $\psi_a({\bf r})$ are eigenstates of a rotationally invariant
Hamiltonian, albeit one which is different from the hydrogen Hamiltonian. This means
that we can label each state by the usual quantum numbers

$$ a = (n, l, m) $$

Although we haven’t imposed anti-symmetry of the wavefunction, we do get to choose
these quantum numbers for the states. This means that we can, for example, use this
ansatz to look at the 3-particle states that lie in the shell $1s^22s^1$ as an approximation
for the ground state of lithium.

We will view (3.33) as a very general variational ansatz, where we get to pick anything
we like for each $\psi_{a_i}({\bf r})$. We should compare this to the kind of variational ansatz (3.25)
where we allowed only a single parameter Z to vary. For the Hartree ansatz, we have
an infinite number of variational parameters.


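The multi-electron Hamiltonian is

$$ H = \sum_{i=1}^{N} \left( -\frac{\hbar^2}{2m}\nabla_i^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r_i} \right) + \sum_{i<j} \frac{e^2}{4\pi\epsilon_0}\frac{1}{|{\bf r}_i - {\bf r}_j|} $$

Evaluated on our ansatz (3.33), the average energy is

$$ \begin{aligned} \langle E \rangle = &\sum_{i=1}^{N} \int d^3r\ \psi^\star_{a_i}({\bf r}) \left( -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r} \right) \psi_{a_i}({\bf r}) \\ &+ \frac{e^2}{4\pi\epsilon_0} \sum_{i<j} \int d^3r\, d^3r'\ \frac{\psi^\star_{a_i}({\bf r})\, \psi^\star_{a_j}({\bf r}')\, \psi_{a_i}({\bf r})\, \psi_{a_j}({\bf r}')}{|{\bf r} - {\bf r}'|} \end{aligned} \tag{3.34} $$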



The last term is an example of the kind of “direct integral” (3.31) that we met when 
discussing helium. 


To find the best approximation to the ground state within the product ansatz (3.33),
we minimize $\langle E \rangle$ over all possible states. However, there’s a catch: the states $\psi_{a_i}({\bf r})$
must remain normalised. This is easily achieved by introducing Lagrange multipliers.
To this end, consider the functional

$$ F[\Psi] = \langle E \rangle - \sum_{i} \epsilon_i \left( \int d^3r\ |\psi_{a_i}({\bf r})|^2 - 1 \right) $$

with $\epsilon_i$ the N Lagrange multipliers imposing the normalisation condition.


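Because the wavefunction is complex, we can vary its real and imaginary parts
independently. Since we have N independent wavefunctions, this gives rise to 2N real
conditions. It’s not too hard to convince yourself that this is formally equivalent to
treating $\psi({\bf r})$ and $\psi^\star({\bf r})$ as independent and varying each of them, leaving the other
fixed. Minimizing $F[\Psi]$ then requires us to solve

$$ \frac{\delta F[\Psi]}{\delta \psi^\star_{a_i}({\bf r})} = 0 \quad {\rm and} \quad \frac{\delta F[\Psi]}{\delta \psi_{a_i}({\bf r})} = 0 $$

The first of these is N complex conditions, while the second is simply the conjugate of
the first. These N complex conditions are called the Hartree equations,

$$ \left[ -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r} + \frac{e^2}{4\pi\epsilon_0} \sum_{j \neq i} \int d^3r'\ \frac{|\psi_{a_j}({\bf r}')|^2}{|{\bf r} - {\bf r}'|} \right] \psi_{a_i}({\bf r}) = \epsilon_i\, \psi_{a_i}({\bf r}) \tag{3.35} $$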


These equations look tantalisingly similar to the Schrödinger equation. The only
difference, and it is a big difference, is that the effective potential for $\psi_{a_i}({\bf r})$ depends
on the wavefunctions for all the other electrons, through the contribution

$$ U_{a_i}({\bf r}) = \frac{e^2}{4\pi\epsilon_0} \sum_{j\neq i} \int d^3r'\ \frac{|\psi_{a_j}({\bf r}')|^2}{|{\bf r} - {\bf r}'|} \tag{3.36} $$

Physically this is clear: the potential $U_{a_i}({\bf r})$ is the electrostatic repulsion due to all the
other electrons. Note that each electron experiences a different effective Hamiltonian,
with a different $U_{a_i}({\bf r})$. The catch is that each of the $\psi_{a_j}({\bf r})$ that appears in the potential
$U_{a_i}({\bf r})$ is also determined by one of the Hartree equations.

The Hartree equations (3.35) are not easy to solve. They are N coupled, non-linear
integro-differential equations. We see that there’s a certain circularity needed to get to
the solution: the potentials $U_{a_i}({\bf r})$ determine the wavefunctions but are also determined
by the wavefunctions. In this sense, the ultimate solution for $U_{a_i}({\bf r})$ is said to be
“self-consistent”.


The usual techniques that we use for the Schrödinger equation do not work for the
Hartree equations. Instead, we usually proceed iteratively. We start by guessing a
form for the potentials $U_{a_i}({\bf r})$ which we think is physically realistic. Often this involves
making the further approximation that the potential is spherically symmetric, so we replace

$$ U_{a_i}({\bf r}) \ \rightarrow\ U_{a_i}(r) = \int \frac{d\Omega}{4\pi}\, U_{a_i}({\bf r}) $$

Then, with this potential in hand, we solve the Schrödinger equations

$$ \left[ -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r} + U_{a_i}(r) \right] \psi_{a_i}({\bf r}) = \epsilon_i\, \psi_{a_i}({\bf r}) \tag{3.37} $$

This can be done numerically. We then substitute the resulting wavefunctions back
into the definition of the potential (3.36) and then play the whole game again. If we
chose a good starting point, this whole process will begin to converge.


Suppose that we’ve done all of this. What is the answer for the ground state energy of
the atom? From (3.35), the Lagrange multipliers $\epsilon_i$ look like the energies of individual
particles. We can write

$$ \epsilon_i = \int d^3r\ \psi^\star_{a_i}({\bf r}) \left[ -\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0}\frac{1}{r} + \frac{e^2}{4\pi\epsilon_0} \sum_{j\neq i}\int d^3r'\ \frac{|\psi_{a_j}({\bf r}')|^2}{|{\bf r}-{\bf r}'|} \right] \psi_{a_i}({\bf r}) $$

Summing these gives an expression that is almost the same as the expected energy
(3.34), except that the sum $\sum_i \sum_{j\neq i}$ is twice the sum $\sum_{i<j}$. The energy evaluated on the
solutions to the Hartree equations is then given by

$$ \langle E \rangle = \sum_{i=1}^{N} \epsilon_i - \frac{e^2}{4\pi\epsilon_0} \sum_{i<j} \int d^3r\, d^3r'\ \frac{|\psi_{a_i}({\bf r})|^2\, |\psi_{a_j}({\bf r}')|^2}{|{\bf r} - {\bf r}'|} $$

By the usual variational arguments, this gives an upper bound for the ground state
energy.

An Example: Potassium

We won't describe in detail the numerical solutions to the Hartree equations (nor to
the more sophisticated Hartree-Fock equations that we will meet shortly). We can,
however, use this approach to offer some hand-waving intuition for one of the more
surprising features of the aufbau principle: why does the 4s state fill up before the 3d
state?

This question first arises in potassium, an alkali metal with electron configuration
$1s^22s^22p^63s^23p^64s^1$. Why is the last electron in 4s rather than 3d as the non-interacting
picture of electrons would suggest?


In the Hartree approach, we see that the electron experiences an effective potential
with Schrödinger equation (3.37). The key piece of physics that determines U(r) is,
once again, screening. When the electron is far away, the nuclear charge Ze is expected
to be almost entirely screened by the other Z − 1 electrons. In contrast, when the
electron is close to the nucleus, we expect that it feels the full force of the Ze charge.
On these grounds, the total effective potential should be

$$ -\frac{Ze^2}{4\pi\epsilon_0 r} + U(r) = -\frac{Z(r)\, e^2}{4\pi\epsilon_0 r} $$

where Z(r) is some function which interpolates between $Z(r) \rightarrow Z$ as $r \rightarrow 0$ and
$Z(r) \rightarrow 1$ as $r \rightarrow \infty$.


We should now solve the Schrödinger equation with this potential. All quantum 
states are labelled by the usual triplet (n, l,m), but as the potential is no longer simply 
1/r the energy levels will depend on both n and l. The basic physics is the same as 
we described for the excited states of helium. The l = 0 s-wave states extend to the 
origin which causes their energy to be lower. In contrast, the higher l states experience 
an angular momentum barrier which keeps them away from the origin and raises their 
energy. This explains why 3s fills up before 3p. But this same screening effect also 
lowers the 4s states below that of 3d. 
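
The way screening lifts the degeneracy in l can be illustrated with a small numerical experiment. The sketch below is a toy rather than a model of potassium: it assumes the interpolating form Z(r) = 1 + (Z - 1) exp(-r/a) with the arbitrary choices Z = 3 and a = 1 (in atomic units), and diagonalises the radial Schrödinger equation on a uniform grid. The function name `radial_levels` and all of its parameters are hypothetical.

```python
import numpy as np

def radial_levels(l, Z=3, a=1.0, n_grid=1500, r_max=45.0):
    """Energies (Hartree units) for angular momentum l in the potential -Z(r)/r,
    with the assumed screening function Z(r) = 1 + (Z - 1) exp(-r/a)."""
    h = r_max / (n_grid + 1)
    r = h * np.arange(1, n_grid + 1)
    z_eff = 1.0 + (Z - 1.0) * np.exp(-r / a)       # -> Z as r -> 0, -> 1 as r -> infinity
    v = -z_eff / r + l * (l + 1) / (2.0 * r**2)    # includes the angular momentum barrier
    off = np.full(n_grid - 1, -0.5 / h**2)
    H = np.diag(1.0 / h**2 + v) + np.diag(off, 1) + np.diag(off, -1)
    return np.linalg.eigvalsh(H)                   # sorted in ascending order

s_levels, p_levels, d_levels = (radial_levels(l) for l in (0, 1, 2))
# For fixed l, the k-th eigenvalue has principal quantum number n = l + 1 + k
E = {"2s": s_levels[1], "2p": p_levels[0],
     "3s": s_levels[2], "3p": p_levels[1], "3d": d_levels[0]}
for name, value in E.items():
    print(f"{name}: {value:.4f}")
```

In this toy the states with the same n are ordered by l, with the s states lowest. Whether a given level with larger n drops below one with smaller n, as 4s does relative to 3d in potassium, depends on the detailed form of Z(r), which this sketch does not attempt to get right.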


3.3.2 The Slater Determinant 


The Hartree ansatz (3.33) is not anti-symmetric under the exchange of particles. As 
such, it is not a physical wavefunction in the Hilbert space of fermions. We would like 
to remedy this. 


Our task is a simple one: given a collection of 1-particle states, how do we construct
a multi-particle wavefunction for fermions that is anti-symmetric under the exchange
of any pair of particles? This general question arises in many contexts beyond the
spectrum of atoms.


We will use the notation $|\psi_i(j)\rangle$ to mean the particle $j$ occupies the one-particle state
$|\psi_i\rangle$. Then we can build a suitably anti-symmetrised N-particle wavefunction by using
the Slater determinant,

$$|\Psi\rangle = \frac{1}{\sqrt{N!}}\begin{vmatrix} |\psi_1(1)\rangle & |\psi_1(2)\rangle & \cdots & |\psi_1(N)\rangle \\ |\psi_2(1)\rangle & |\psi_2(2)\rangle & \cdots & |\psi_2(N)\rangle \\ \vdots & & & \vdots \\ |\psi_N(1)\rangle & |\psi_N(2)\rangle & \cdots & |\psi_N(N)\rangle \end{vmatrix}$$

Expanding out the determinant gives $N!$ terms that come with plus and minus signs.
The overall factor of $1/\sqrt{N!}$ ensures that the resulting state is normalised. The plus
and minus signs provide the anti-symmetry that we need for fermions. In fact, we
can see this quickly without expanding out: swapping the first and second particle is
tantamount to swapping the first and second columns of the matrix. But we know that
this changes the determinant by a minus sign. In particular, if two particles sit in the
same state then the rows of the matrix become linearly dependent and the determinant
vanishes. In this way, the Slater determinant enforces the Pauli exclusion principle.
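
Both properties are easy to check numerically. The sketch below evaluates the amplitude of a Slater determinant at given particle positions by summing over permutations. The three one-dimensional orbitals and the sample positions are arbitrary choices made purely for illustration, and the names `parity` and `slater` are hypothetical.

```python
from itertools import permutations
from math import exp, factorial, sqrt

def parity(perm):
    """Sign of a permutation, found by sorting it with transpositions."""
    perm = list(perm)
    sign = 1
    for i in range(len(perm)):
        while perm[i] != i:
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]   # each swap flips the sign
            sign = -sign
    return sign

def slater(orbitals, coords):
    """Psi(x_1, ..., x_N) = det[ psi_i(x_j) ] / sqrt(N!), via the Leibniz sum."""
    n = len(orbitals)
    total = 0.0
    for perm in permutations(range(n)):
        term = parity(perm)
        for particle, state in enumerate(perm):
            term *= orbitals[state](coords[particle])
        total += term
    return total / sqrt(factorial(n))

# Illustrative one-particle states (assumed, not from the text)
orbitals = [
    lambda x: exp(-x * x / 2),
    lambda x: x * exp(-x * x / 2),
    lambda x: (2 * x * x - 1) * exp(-x * x / 2),
]
coords = [0.3, -1.1, 0.7]
swapped = [-1.1, 0.3, 0.7]                        # particles 1 and 2 exchanged

psi = slater(orbitals, coords)
psi_swapped = slater(orbitals, swapped)
psi_repeated = slater([orbitals[0], orbitals[0], orbitals[2]], coords)
print(psi, psi_swapped, psi_repeated)
```

Exchanging two particles flips the sign of the amplitude, while putting two particles in the same orbital makes it vanish identically.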


One can build the Slater determinant for any states $|\psi_i\rangle$ which span an N-dimensional
Hilbert space. It will be convenient to choose the states $|\psi_i\rangle$ to form an orthonormal
basis.


An Example: Helium 


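For helium, we take the set of one-particle states to be the hydrogen wavefunctions for
Z = 2, so $|\psi_a\rangle = \psi_{(n,l,m)}(\mathbf{r})\otimes|m_s\rangle$ where the spin quantum number $m_s = \pm\frac{1}{2}$ is usually
replaced by the notation $|\frac{1}{2}\rangle = |\uparrow\rangle$ and $|-\frac{1}{2}\rangle = |\downarrow\rangle$.


For the ground state we place both particles in the 1s state with different spins. The
corresponding Slater determinant is

$$\frac{1}{\sqrt{2}}\begin{vmatrix} \psi_{1s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\uparrow\rangle \\ \psi_{1s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\downarrow\rangle \end{vmatrix} = \psi_{1s}(\mathbf{r}_1)\psi_{1s}(\mathbf{r}_2)\otimes|0,0\rangle$$

where $|0,0\rangle$ is the spin-singlet state (3.26). This is the ground state of helium that we
used previously.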
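When particles sit in different hydrogenic states, there are more possibilities for the
Slater determinant. For example, for the $1s^1 2s^1$ excited state, there are four Slater
determinants. Two of these sit in spin eigenstates

$$\frac{1}{\sqrt{2}}\begin{vmatrix} \psi_{1s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\uparrow\rangle \\ \psi_{2s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{2s}(\mathbf{r}_2)\otimes|\uparrow\rangle \end{vmatrix} = \Psi_-(\mathbf{r}_1,\mathbf{r}_2)\otimes|1,1\rangle$$

$$\frac{1}{\sqrt{2}}\begin{vmatrix} \psi_{1s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\downarrow\rangle \\ \psi_{2s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{2s}(\mathbf{r}_2)\otimes|\downarrow\rangle \end{vmatrix} = \Psi_-(\mathbf{r}_1,\mathbf{r}_2)\otimes|1,-1\rangle$$

where $\Psi_\pm(\mathbf{r}_1,\mathbf{r}_2) = \frac{1}{\sqrt{2}}\left(\psi_{1s}(\mathbf{r}_1)\psi_{2s}(\mathbf{r}_2) \pm \psi_{1s}(\mathbf{r}_2)\psi_{2s}(\mathbf{r}_1)\right)$ and $|1,m\rangle$ are the spin-triplet
states (3.29). Meanwhile, the other Slater determinants are

$$\frac{1}{\sqrt{2}}\begin{vmatrix} \psi_{1s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\uparrow\rangle \\ \psi_{2s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{2s}(\mathbf{r}_2)\otimes|\downarrow\rangle \end{vmatrix} = \frac{1}{\sqrt{2}}\Big(\Psi_+(\mathbf{r}_1,\mathbf{r}_2)\otimes|0,0\rangle + \Psi_-(\mathbf{r}_1,\mathbf{r}_2)\otimes|1,0\rangle\Big)$$

$$\frac{1}{\sqrt{2}}\begin{vmatrix} \psi_{1s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\downarrow\rangle \\ \psi_{2s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{2s}(\mathbf{r}_2)\otimes|\uparrow\rangle \end{vmatrix} = \frac{1}{\sqrt{2}}\Big(-\Psi_+(\mathbf{r}_1,\mathbf{r}_2)\otimes|0,0\rangle + \Psi_-(\mathbf{r}_1,\mathbf{r}_2)\otimes|1,0\rangle\Big)$$

We see that the Slater determinants do not necessarily give spin eigenstates.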
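
To see where the mixture of singlet and triplet comes from, it may help to expand the determinant with spin up in 1s and spin down in 2s by hand. The steps below assume the usual conventions $|1,0\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle)$ and $|0,0\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle)$ for the states of (3.29) and (3.26), with the first spin belonging to particle 1, and write $\Psi_\pm = \frac{1}{\sqrt{2}}(\psi_{1s}(\mathbf{r}_1)\psi_{2s}(\mathbf{r}_2) \pm \psi_{1s}(\mathbf{r}_2)\psi_{2s}(\mathbf{r}_1))$.

```latex
\begin{aligned}
\frac{1}{\sqrt{2}}
\begin{vmatrix}
\psi_{1s}(\mathbf{r}_1)\otimes|\uparrow\rangle & \psi_{1s}(\mathbf{r}_2)\otimes|\uparrow\rangle \\
\psi_{2s}(\mathbf{r}_1)\otimes|\downarrow\rangle & \psi_{2s}(\mathbf{r}_2)\otimes|\downarrow\rangle
\end{vmatrix}
&= \frac{1}{\sqrt{2}}\Big[\psi_{1s}(\mathbf{r}_1)\psi_{2s}(\mathbf{r}_2)\,|\uparrow\rangle|\downarrow\rangle
   - \psi_{1s}(\mathbf{r}_2)\psi_{2s}(\mathbf{r}_1)\,|\downarrow\rangle|\uparrow\rangle\Big] \\
% substitute |up,down> = (|1,0> + |0,0>)/sqrt(2) and |down,up> = (|1,0> - |0,0>)/sqrt(2)
&= \frac{1}{2}\Big[\big(\psi_{1s}(\mathbf{r}_1)\psi_{2s}(\mathbf{r}_2) - \psi_{1s}(\mathbf{r}_2)\psi_{2s}(\mathbf{r}_1)\big)|1,0\rangle
   + \big(\psi_{1s}(\mathbf{r}_1)\psi_{2s}(\mathbf{r}_2) + \psi_{1s}(\mathbf{r}_2)\psi_{2s}(\mathbf{r}_1)\big)|0,0\rangle\Big] \\
&= \frac{1}{\sqrt{2}}\Big[\Psi_+(\mathbf{r}_1,\mathbf{r}_2)\otimes|0,0\rangle + \Psi_-(\mathbf{r}_1,\mathbf{r}_2)\otimes|1,0\rangle\Big]
\end{aligned}
```

The state has $S_z = 0$ but is an equal-weight superposition of total spin 0 and total spin 1. Exchanging the roles of up and down flips the relative sign between the two terms.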


This is one of the short-comings of the Slater determinant. In general, one can show
that the state $|\Psi\rangle$ can always be guaranteed to be an eigenstate of angular momentum
$L_z$ and spin $S_z$. But it is not always an eigenstate of $\mathbf{L}^2$ and $\mathbf{S}^2$.


3.3.3 The Hartree-Fock Method 


The Hartree-Fock method is a repeat of the Hartree method, but now with the fully
anti-symmetrised wavefunction

$$|\Psi\rangle = \frac{1}{\sqrt{N!}}\begin{vmatrix} |\psi_{a_1}(1)\rangle & |\psi_{a_1}(2)\rangle & \cdots & |\psi_{a_1}(N)\rangle \\ |\psi_{a_2}(1)\rangle & |\psi_{a_2}(2)\rangle & \cdots & |\psi_{a_2}(N)\rangle \\ \vdots & & & \vdots \\ |\psi_{a_N}(1)\rangle & |\psi_{a_N}(2)\rangle & \cdots & |\psi_{a_N}(N)\rangle \end{vmatrix} \qquad (3.38)$$

Further, we will take the quantum numbers $a_i$ to include both the $(n, l, m)$ information
about the orbital angular momentum state, as well as the spin degrees of freedom of
the electron. (Had we included spin in the original Hartree ansatz, it simply would
have dropped out of the final answer; but now that we have anti-symmetry the spin
wavefunctions are correlated with the spatial wavefunctions.)
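Repeating the Hartree story, we find that the average energy in the state $|\Psi\rangle$ contains
one extra term

$$\begin{aligned}\langle E\rangle = {} & \sum_{i=1}^N\int d^3r\;\psi^\star_{a_i}(\mathbf{r})\left[-\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0 r}\right]\psi_{a_i}(\mathbf{r}) \\ & + \frac{e^2}{4\pi\epsilon_0}\sum_{i<j}\int d^3r\,d^3r'\;\frac{\psi^\star_{a_i}(\mathbf{r})\,\psi^\star_{a_j}(\mathbf{r}')\,\psi_{a_i}(\mathbf{r})\,\psi_{a_j}(\mathbf{r}')}{|\mathbf{r}-\mathbf{r}'|} \\ & - \frac{e^2}{4\pi\epsilon_0}\sum_{i<j}\int d^3r\,d^3r'\;\frac{\psi^\star_{a_i}(\mathbf{r})\,\psi^\star_{a_j}(\mathbf{r}')\,\psi_{a_i}(\mathbf{r}')\,\psi_{a_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|}\;\delta_{m_{s_i},m_{s_j}}\end{aligned}$$

The last term is an exchange integral of the kind we met when discussing the helium
atom (3.32). The delta function ensures that it only contributes if the $a_i$ and $a_j$ spin
states coincide.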


While the direct integral clearly captures the electrostatic repulsion between elec- 
trons, it is somewhat harder to drape comforting classical words around the exchange 
term. It is a purely quantum effect arising from the Pauli exclusion principle. Nonethe- 
less, we can extract some physics from it, in particular from the fact that the delta 
function means that the exchange term lowers the energy only when spins are aligned. 
This means that, all else being equal, the spins will wish to align. This is the first of 
three Hund’s rules. (The other two describe the preferential order to fill degenerate 
states with quantum numbers L and J = L + S; we won’t discuss these second two 
rules in these lectures.)


In practice, this does nothing for a filled shell. In this case, half the electrons have spin 
up and the other half spin down. However, when we start to fill a shell, the exchange 
term means that it’s preferable for all the spins to point in the same direction. This 
suggests that half-filled shells should be particularly stable and the next electron to 
go in after half-filling should have a noticeably larger energy and so the atom will, 
correspondingly, have a smaller ionization energy. 


We can see evidence for this by looking again at the ionization data. The ionization
energy does not increase monotonically between Li and Ne: there are two glitches. The
first of these is the jump from beryllium ($2s^2$) to boron ($2s^2 2p^1$) where we jump to
another sub-shell. The other is the jump from nitrogen ($1s^2 2s^2 2p^3$) to oxygen ($1s^2 2s^2 2p^4$).
Nitrogen has a half-filled 2p sub-shell, where all three electrons have spin up to benefit
from the exchange energy. But for oxygen one electron is spin down, and the benefit
from the exchange energy is less. This means that the next electron costs higher
energy and, correspondingly, the ionization energy is smaller. The same behaviour is
seen to disrupt the linear growth between Na and Ar. The two glitches occur between
magnesium ([Ne]$3s^2$) and aluminium ([Ne]$3s^2 3p^1$) where we jump to the next sub-shell, and
between phosphorus ([Ne]$3s^2 3p^3$) and sulphur ([Ne]$3s^2 3p^4$) where we cross the half-filled
sub-shell.


Figure 21: Ionization data again.


The exchange energy also lies behind one of the exceptions to the aufbau principle.
Recall that chromium has electron configuration [Ar]$3d^5 4s^1$ as opposed to the aufbau-
predicted [Ar]$3d^4 4s^2$. The former configuration has lower energy because it allows all
spins to point up and so benefits more from the exchange term.
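
A crude way to quantify these statements is to count the pairs of electrons with aligned spins, since only those pairs pick up an exchange term. The sketch below does just this counting. Treating every aligned pair as contributing equally is an assumption made for illustration (the actual exchange integrals differ from pair to pair), and the function names are hypothetical.

```python
from math import comb

def parallel_pairs(n_up, n_down):
    """Number of electron pairs with aligned spins."""
    return comb(n_up, 2) + comb(n_down, 2)

def p_shell_pairs(n_electrons):
    """Fill a p sub-shell (3 orbitals) with as many spins up as Pauli allows."""
    n_up = min(n_electrons, 3)
    return parallel_pairs(n_up, n_electrons - n_up)

pairs = [p_shell_pairs(n) for n in range(0, 7)]        # p^0, p^1, ..., p^6
gain = [pairs[n] - pairs[n - 1] for n in range(1, 7)]  # new aligned pairs per added electron
print("aligned pairs:", pairs)
print("gain per electron:", gain)

# Chromium: aligned pairs among the six outer electrons
cr_3d5_4s1 = parallel_pairs(6, 0)   # five 3d and one 4s electron, all spin up
cr_3d4_4s2 = parallel_pairs(5, 1)   # four 3d up, 4s holding one up and one down
print("3d5 4s1:", cr_3d5_4s1, " 3d4 4s2:", cr_3d4_4s2)
```

The gain grows while the sub-shell is filling up to the half-way point and then drops to zero for the fourth electron, which is the step from nitrogen to oxygen. For chromium the count favours the configuration with all six spins aligned.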


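Minimising the energy $\langle E\rangle$ gives us N coupled equations

$$\left[-\frac{\hbar^2}{2m}\nabla^2 - \frac{Ze^2}{4\pi\epsilon_0 r} + U(\mathbf{r})\right]\psi_{a_i}(\mathbf{r}) - \int d^3r'\; U^{\rm ex}_{a_i}(\mathbf{r},\mathbf{r}')\,\psi_{a_i}(\mathbf{r}') = \epsilon_i\,\psi_{a_i}(\mathbf{r}) \qquad (3.39)$$

where $U(\mathbf{r})$ is given by

$$U(\mathbf{r}) = \frac{e^2}{4\pi\epsilon_0}\sum_{j=1}^N\int d^3r'\;\frac{|\psi_{a_j}(\mathbf{r}')|^2}{|\mathbf{r}-\mathbf{r}'|}$$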


This differs from the Hartree expression (3.36) because we sum over all states $\sum_j$ rather
than $\sum_{j\neq i}$. This is a simplification because it means that all electrons feel the same
potential. However, it is also puzzling because it would appear to suggest that we need
to include a “self-interaction” between the electrons. But this $i = j$ term is an artefact
of the way we’ve written things: it cancels the corresponding term in the exchange
integral, which is given by

$$U^{\rm ex}_{a_i}(\mathbf{r},\mathbf{r}') = \frac{e^2}{4\pi\epsilon_0}\sum_{j=1}^N\frac{\psi^\star_{a_j}(\mathbf{r}')\,\psi_{a_j}(\mathbf{r})}{|\mathbf{r}-\mathbf{r}'|}\;\delta_{m_{s_i},m_{s_j}}$$

This is sometimes referred to as a non-local potential. This term does depend on
the state $a_i$, but only through the spin dependence. This means that each electron
experiences one of two different exchange potentials, $U^{\rm ex}_\uparrow$ or $U^{\rm ex}_\downarrow$.


The set of equations (3.39) are known as the Hartree-Fock equations. It should come 
as no surprise to learn that they are no easier to solve than the Hartree equations. 
Indeed, the presence of the exchange term makes even numerical solutions considerably 
harder to come by. Nonetheless, this scheme has some success in reproducing the 
properties of atoms observed in the periodic table, in particular the aufbau principle. 


Limitations of Hartree-Fock 


We finish with a warning. Throughout this section, we’ve used the language of one- 
particle states to describe atoms. Indeed, the basic idea that we’ve focussed on is 
that atoms are made by filling successive shells of states. This is something that is 
often taught in high school and, over time, becomes so familiar that we don’t question 
it. The Hartree-Fock method panders to this idea because it looks for states within 
the anti-symmetrised product ansatz (3.38). However, the vast majority of states in 
the Hilbert space are not of the product form and, for complicated atoms, it’s quite 
possible, indeed likely, that the true ground state is a superposition of such states. In 
this case the very language of filling shells becomes inappropriate since there’s no way
to say that any electron sits in a given state. 
4. Atoms in Electromagnetic Fields


Our goal in this chapter is to understand how atoms interact with electromagnetic 
fields. 


There will be several stages to our understanding. We start by looking at atoms 
in constant, background electromagnetic fields. Because these fields break various 
symmetries of the problem, we expect to see a splitting in the degeneracies of states. 
The splitting of the atomic spectrum due to an electric field is called the Stark effect. 
The splitting due to a magnetic field is called the Zeeman effect. We deal with each in 
turn. 


We then move on to look at what happens when we shine light on atoms. Here the 
physics is more dramatic: the atom can absorb a photon, causing the electron to jump 
from one state to a higher one. Alternatively the electron can decay to a lower state,
emitting a photon as it falls. We will begin with a classical treatment of the light but, 
ultimately, we will need to treat both light and atoms in a quantum framework. 


4.1 The Stark Effect 


“Schrödinger applied perturbation theory to the Stark effect. It was my task
to present his perturbation theory to the seminar, which sounded perfectly 
straightforward, and I have used perturbation theory ever since. Whether 
it is applicable or not.” 


Hans Bethe 


Consider the hydrogen atom, where the electron also experiences a constant, background
electric field. We’ll take the electric field to lie in the z direction, $\mathbf{E} = \mathcal{E}\hat{\mathbf{z}}$. The
Hamiltonian is

$$H = -\frac{\hbar^2}{2m}\nabla^2 - \frac{e^2}{4\pi\epsilon_0 r} + e\mathcal{E}z \qquad (4.1)$$

The total potential energy, $V(z) = e\mathcal{E}z - e^2/4\pi\epsilon_0 r$, is sketched in the diagram.


The first thing to note is that the potential is unbounded below as $z\to-\infty$. This
means that all electron bound states, with wavefunctions localised near the origin, are
now unstable. Any electron can tunnel through the barrier to the left, and then be
accelerated by the electric field to $z\to-\infty$. However, we know from our WKB analysis
in Section 2.2.5 that the probability rate for tunnelling is exponentially suppressed by
the height of the barrier (see, for example, (2.30)). This means that the lowest lying
energy levels will have an extremely long lifetime.
If you want some numbers, the strength of a typical electric field is around
$\mathcal{E}\sim 10$ eV cm$^{-1}$. We know that the ground state of hydrogen is $E_0\sim -13.6$ eV
and the Bohr radius is $a_0\sim 5\times10^{-9}$ cm, which suggests that the typical
electric field inside the atom is around $\mathcal{E}_{\rm atom}\sim 10^9$ eV cm$^{-1}$, which is eight
orders of magnitude greater than the applied electric field. On general grounds, we
expect that the tunnelling probability is suppressed by a factor of $e^{-10^8}$. At this
point it doesn’t really matter what our units are, this is going to be a very small
number. The states which are well bound are stable for a very long time. Only those
states very close to threshold are in danger of being destabilised by the electric
field. For this reason, we’ll proceed by ignoring the instability.


Figure 22: [sketch of the potential V against z]
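
The arithmetic behind this estimate is short enough to spell out. The sketch below simply divides the binding energy by the Bohr radius to get a characteristic atomic field and compares it with the applied one. The variable names are hypothetical and the inputs are the rounded values quoted above.

```python
import math

binding_energy_eV = 13.6        # magnitude of the hydrogen ground state energy
bohr_radius_cm = 5e-9           # Bohr radius
applied_field_eV_per_cm = 10    # typical applied field (energy per unit length, e * E)

# Characteristic field inside the atom: binding energy over atomic size
atomic_field_eV_per_cm = binding_energy_eV / bohr_radius_cm
ratio = atomic_field_eV_per_cm / applied_field_eV_per_cm

print(f"atomic field ~ {atomic_field_eV_per_cm:.1e} eV/cm")
print(f"atomic / applied ~ 10^{round(math.log10(ratio))}")
```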


4.1.1 The Linear Stark Effect 


We’re going to work in perturbation theory. Before we look at the hydrogen atom, here’s
a general comment about what happens when you perturb by electric fields. Suppose
that we have a non-degenerate energy eigenstate $|\psi\rangle$. Then adding a background,
constant electric field will shift the energy levels by

$$\Delta E = \langle\psi|e\mathbf{E}\cdot\mathbf{x}|\psi\rangle = -\mathbf{P}\cdot\mathbf{E} \qquad (4.2)$$

where we have introduced the electric dipole

$$\mathbf{P} = -e\langle\psi|\mathbf{x}|\psi\rangle = -e\int d^3x\;\mathbf{x}\,|\psi(\mathbf{x})|^2 \qquad (4.3)$$

The shift in energies is first order in the electric field and is known as the linear Stark
effect.


For the hydrogen atom, there is an extra complication: the states $|n,l,m\rangle$ are
degenerate. The energy levels

$$(E_0)_n = -\frac{Ry}{n^2}$$

with $Ry\approx 13.6$ eV have degeneracy $n^2$ (ignoring spin). This means that we will have
to work with degenerate perturbation theory. For the electric field $\mathbf{E} = \mathcal{E}\hat{\mathbf{z}}$, we must
compute the matrix elements

$$\langle n,l',m'|z|n,l,m\rangle$$

With a large degeneracy of $n^2$, this looks like it becomes increasingly complicated as
we go up in energy levels. Fortunately, there is a drastic simplification.


The first simplification follows from using the parity operator $\pi$. Recall from Section
1.1 that the states of the hydrogen atom transform as (1.10)

$$\pi|n,l,m\rangle = (-1)^l\,|n,l,m\rangle$$

from which we have

$$\langle n,l',m'|z|n,l,m\rangle = (-1)^{l+l'}\langle n,l',m'|\pi z\pi|n,l,m\rangle = (-1)^{l+l'+1}\langle n,l',m'|z|n,l,m\rangle$$

This means that the matrix element is non-vanishing only if $l+l'$ is odd. From this, we
immediately learn that the unique ground state $|n=1,0,0\rangle$ does not change its energy
at leading order.


We can also use the fact that the perturbation commutes with $L_z$. This means that

$$m\hbar\,\langle n,l',m'|z|n,l,m\rangle = \langle n,l',m'|zL_z|n,l,m\rangle = \langle n,l',m'|L_z z|n,l,m\rangle = m'\hbar\,\langle n,l',m'|z|n,l,m\rangle$$

So the perturbation is non-vanishing only if $m = m'$. (In Section 4.3.3, we’ll see
that electric fields in the x or y direction have non-vanishing matrix elements only if
$m' = m\pm1$.)


This is enough to determine the corrections to the n = 2 states. The $|2,1,\pm1\rangle$ states
remain unaffected at leading order. Meanwhile, the $|2,0,0\rangle$ state mixes with the $|2,1,0\rangle$
state. The integrals over the hydrogen wavefunctions are straightforward to evaluate
and yield

$$U = \langle 2,0,0|e\mathcal{E}z|2,1,0\rangle = -3e\mathcal{E}a_0$$

The first corrections to the energy are then given by the eigenvalues of the matrix

$$-3e\mathcal{E}a_0\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$

We learn that, to first order in perturbation theory, the n = 2 energy eigenstates and
eigenvalues are given by

$$|2,1,\pm1\rangle \quad {\rm with} \quad E = (E_0)_{n=2} \qquad (4.4)$$

and

$$|2,\pm\rangle = \frac{1}{\sqrt{2}}\left(|2,0,0\rangle \pm |2,1,0\rangle\right) \quad {\rm with} \quad E = (E_0)_{n=2} \mp 3e\mathcal{E}a_0 \qquad (4.5)$$

From our general discussion above, we learn that the eigenstates $|2,\pm\rangle$ can be thought
of as having a permanent electric dipole moment (4.3).
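
The n = 2 calculation can be checked numerically. The sketch below integrates the standard hydrogen radial wavefunctions on a grid to get the one non-vanishing matrix element and then diagonalises the perturbation in the four-dimensional n = 2 space. It assumes units with $a_0 = 1$ and energies measured in units of $e\mathcal{E}a_0$. The sign of the matrix element depends on the phase convention for the wavefunctions, and the convention used here is an assumption.

```python
import numpy as np

# Radial hydrogen wavefunctions for n = 2, in units where a0 = 1
r = np.linspace(0.0, 60.0, 60001)
dr = r[1] - r[0]
R20 = (1 - r / 2) * np.exp(-r / 2) / np.sqrt(2)
R21 = r * np.exp(-r / 2) / (2 * np.sqrt(6))

radial = np.sum(R20 * R21 * r**3) * dr   # integral of R20 * r * R21 * r^2 dr
angular = 1 / np.sqrt(3)                 # integral of Y00* cos(theta) Y10 over the sphere
z_elem = radial * angular                # <2,0,0| z |2,1,0> in units of a0

# Perturbation e*E*z in the basis (|2,0,0>, |2,1,0>, |2,1,1>, |2,1,-1>)
W = np.zeros((4, 4))
W[0, 1] = W[1, 0] = z_elem
shifts = np.linalg.eigvalsh(W)           # first-order energy shifts, ascending

print("matrix element:", round(z_elem, 4))
print("energy shifts in units of e*E*a0:", np.round(shifts, 4))
```

Two of the four states are unshifted, while the other two move by equal and opposite amounts, in line with (4.4) and (4.5).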


For higher energy levels $n\geq 3$, we need to look at the different $l$ quantum numbers
more carefully. In Section 4.3.3, we will show that $\langle n,l',m'|z|n,l,m\rangle$ is non-vanishing
only if $l' = l\pm1$.
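
For the m = 0 states, this selection rule can be previewed numerically, because the angular part of the matrix element of $z = r\cos\theta$ reduces to an integral of two Legendre polynomials against $\cos\theta$. The sketch below evaluates these integrals by Gauss-Legendre quadrature, which is exact for polynomial integrands of this degree. It covers only the angular integral for m = m' = 0 and drops the normalisation of the spherical harmonics, and the helper names are hypothetical.

```python
import numpy as np
from numpy.polynomial import legendre

nodes, weights = legendre.leggauss(12)   # exact for polynomials up to degree 23

def legendre_p(l, x):
    """Legendre polynomial P_l evaluated at x."""
    return legendre.legval(x, [0.0] * l + [1.0])

def cos_theta_element(l_prime, l):
    """Integral over x = cos(theta) of P_l'(x) * x * P_l(x) on [-1, 1]."""
    return float(np.sum(weights * legendre_p(l_prime, nodes) * nodes * legendre_p(l, nodes)))

table = {(lp, l): cos_theta_element(lp, l) for lp in range(5) for l in range(5)}
nonzero = sorted(pair for pair, value in table.items() if abs(value) > 1e-12)
print(nonzero)
```

Only pairs with $l' = l\pm1$ survive, which is a stronger statement than the parity argument, where any odd $l + l'$ was allowed.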


4.1.2 The Quadratic Stark Effect 


We saw above that the vast majority of states do not receive corrections at first order 
in perturbation theory. This is because these states do not have a permanent dipole 
moment P, a fact which showed up above as the vanishing of matrix elements due to 
parity. 


However, at second order in perturbation theory all states will receive corrections. 
As we now see, this can be understood as the formation of an induced dipole moment. 


Here we focus on the ground state $|1,0,0\rangle$. A standard application of second order
perturbation theory tells us that the shift of the ground state energy level is

$$\Delta E = e^2\mathcal{E}^2\sum_{n=2}^\infty\sum_{l,m}\frac{|\langle 1,0,0|z|n,l,m\rangle|^2}{E_1 - E_n} \qquad (4.6)$$

In fact, strictly speaking, we should also include an integral over the continuum states,
as well as the bound states above. However, it turns out that these are negligible.
Moreover, the summand above turns out to scale as $1/n^3$ for large n, so only the first
few n contribute significantly.


The exact result is not so important for our purposes. More interesting is the parametric
dependence which follows from (4.6)

$$\Delta E = -4\pi\epsilon_0\,C\,\mathcal{E}^2 a_0^3$$

where $C$ is a number of order 1 that you get from doing the sum. For what it’s worth,
$C = \frac{9}{4}$.


The polarisation is given by

$$\mathbf{P} = -\nabla_{\mathbf{E}}E \qquad (4.7)$$


where $\nabla_{\mathbf{E}}$ means “differentiate with respect to the components of the electric field”
and the thing we’re differentiating, which is a non-bold $E$, is the energy. Note that
for states with a permanent dipole, this definition agrees with the energy (4.2) which
is linear in the electric field. However, for states with an induced dipole, the energy is
typically proportional to $\mathbf{E}\cdot\mathbf{E}$, and the definition (4.7) means that it can be written as

$$\Delta E = -\frac{1}{2}\mathbf{P}\cdot\mathbf{E}$$

From our expression above, we see that the ground state of hydrogen has an induced
polarisation of this kind, given by

$$\mathbf{P} = 2C\times 4\pi\epsilon_0 a_0^3\,\mathbf{E} \qquad (4.8)$$

We’ve actually seen the result (4.8) before: in the lectures on Electromagnetism we
discussed Maxwell’s equations in matter and started with a simple classical model of
the polarisation of an atom that gave the expression (4.8) with $2C = 1$ (see the start
of Section 7.1 of those lectures). The quantum calculation above, with $2C = \frac{9}{2}$, is the
right way to do things.
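
For completeness, the chain of steps from the quadratic energy shift to the induced dipole can be written out explicitly. Nothing beyond the definitions above is used. The only added notation is the polarisability $\alpha$, introduced here as shorthand for the coefficient relating $\mathbf{P}$ to $\mathbf{E}$.

```latex
\begin{aligned}
\Delta E &= -4\pi\epsilon_0\, C\, a_0^3\; \mathbf{E}\cdot\mathbf{E} \\
\mathbf{P} &= -\nabla_{\mathbf{E}}\,\Delta E
            = 2C \times 4\pi\epsilon_0\, a_0^3\, \mathbf{E}
            \equiv \alpha\, \mathbf{E} \\
-\tfrac{1}{2}\,\mathbf{P}\cdot\mathbf{E}
           &= -C \times 4\pi\epsilon_0\, a_0^3\; \mathbf{E}\cdot\mathbf{E}
            = \Delta E \\
C = \tfrac{9}{4} \quad &\Longrightarrow \quad
\alpha = \tfrac{9}{2}\times 4\pi\epsilon_0\, a_0^3
\end{aligned}
```

The factor of $\frac{1}{2}$ in $\Delta E = -\frac{1}{2}\mathbf{P}\cdot\mathbf{E}$ is the usual one for a dipole that is itself proportional to the field.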


Degeneracies in the Presence of an Electric Field 


As we’ve seen above, only degenerate states $|n,l',m'\rangle$ and $|n,l,m\rangle$ with $l' = l\pm1$ and
$m = m'$ are affected at leading order in perturbation theory. All states are affected at
second order. When the dust settles, what does the spectrum look like?


On general grounds, we expect that the large degeneracy of the hydrogen atom 
is lifted. The addition of an electric field breaks both the hidden SO(4) symmetry 
of the hydrogen atom — which was responsible for the degeneracy in l — and the 
rotational symmetry which was responsible for the degeneracy in m. We therefore 
expect these degeneracies to be lifted and, indeed, this is what we find. We retain the 
spin degeneracy, $m_s = \pm\frac{1}{2}$, since the electric field is blind to the spin.


There is, however, one further small degeneracy that remains. This follows from the
existence of two surviving symmetries of the Hamiltonian (4.1). The first is rotations in
the (x, y)-plane, perpendicular to the electric field. This ensures that $[H, L_z] = 0$ and
energy eigenstates can be labelled by the quantum number m. We’ll call these states
$|a;m\rangle$, where a is a label, not associated to a symmetry, which specifies the state. We
have $L_z|a;m\rangle = m\hbar|a;m\rangle$.


The second symmetry is time-reversal invariance discussed in Section 1.2. The anti-unitary
operator $\Theta$ acts on angular momentum as (1.24),

$$\Theta\mathbf{L}\Theta^{-1} = -\mathbf{L}$$

This means that $\Theta|a;m\rangle = |a;-m\rangle$. Because $[\Theta, H] = 0$, the states $|a;m\rangle$ and $|a;-m\rangle$
must have the same energy. This means that most states are two-fold degenerate. The
exception is the m = 0 states. These can be loners.


4.1.3 A Little Nazi-Physics History 


The Stark effect was discovered by Johannes Stark in 1913. For this he was awarded
the 1919 Nobel prize.


Stark was a deeply unpleasant man. He was an early adopter of the Nazi agenda 
and a leading light in the Deutsche Physik movement of the early 1930s whose primary 
goal was to discredit the Jüdische Physik of Einstein’s relativity. Stark’s motivation 
was to win approval from the party and become the Führer of German physics. 


Stark’s plans backfired when he tangled with Heisenberg who had the temerity to 
explain that, regardless of its origin, relativity was still correct. In retaliation, Stark 
branded Heisenberg a “white Jew” and had him investigated by the SS. Things came 
to a head when — and I’m not making this up — Heisenberg’s mum called Himmler’s 
mum and asked the Nazi party to leave her poor boy alone. Apparently the Nazis
realised that they were better off with Heisenberg’s genius than Stark’s bitterness, and 
House Stark fell from grace. 


4.2 The Zeeman Effect 


The last entry in Michael Faraday’s laboratory notebooks describes an experiment in
which he subjected a flame to a strong magnetic field in the hope of finding a shift 
in the spectral lines. He found nothing. Some decades later, in 1896, Pieter Zeeman 
repeated the experiment, but this time with success. The splitting of atomic energy 
levels due to a background magnetic field is now called the Zeeman effect. 


The addition of a magnetic field results in two extra terms in the Hamiltonian. The
first arises because the electron is charged and so, as explained in the lectures on Solid
State Physics, the kinetic terms in the Hamiltonian become

$$H = \frac{1}{2m}(\mathbf{p} + e\mathbf{A})^2 - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \qquad (4.9)$$

where $\mathbf{A}$ is the vector potential and the magnetic field is given by $\mathbf{B} = \nabla\times\mathbf{A}$. We
take the magnetic field to lie in the z-direction: $\mathbf{B} = B\hat{\mathbf{z}}$ and work in symmetric gauge

$$\mathbf{A} = \frac{B}{2}(-y, x, 0)$$

We can now expand out the square in (4.9). The cross terms are $\mathbf{p}\cdot\mathbf{A} = \mathbf{A}\cdot\mathbf{p} =
B(xp_y - yp_x)/2$. Note that, even when viewed as quantum operators, there is no ordering
ambiguity. Moreover, we recognise the combination in brackets as the component of
the angular momentum in the z-direction: $L_z = xp_y - yp_x$. We can then write the
Hamiltonian as


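$$H = \frac{1}{2m}\left(\mathbf{p}^2 + e\,\mathbf{B}\cdot\mathbf{L} + \frac{e^2B^2}{4}(x^2+y^2)\right) - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} \qquad (4.10)$$

Note that the $\mathbf{B}\cdot\mathbf{L}$ term takes the characteristic form of the energy of a magnetic
dipole moment $\boldsymbol{\mu}$ in a magnetic field. Here

$$\boldsymbol{\mu}_L = -\frac{e}{2m}\mathbf{L}$$

is the dipole moment that arises from the orbital angular momentum of the electron.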
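
The expansion of the square that leads from (4.9) to (4.10) is short enough to record in full. The steps below use only the symmetric gauge choice $\mathbf{A} = \frac{B}{2}(-y,x,0)$ stated above, together with the observation that $\nabla\cdot\mathbf{A} = 0$ in this gauge.

```latex
\begin{aligned}
(\mathbf{p} + e\mathbf{A})^2
  &= \mathbf{p}^2 + e\,(\mathbf{p}\cdot\mathbf{A} + \mathbf{A}\cdot\mathbf{p}) + e^2\mathbf{A}^2 \\
\mathbf{p}\cdot\mathbf{A} - \mathbf{A}\cdot\mathbf{p}
  &= -i\hbar\,\nabla\cdot\mathbf{A} = 0 \\
\mathbf{A}\cdot\mathbf{p}
  &= \tfrac{B}{2}\,(x p_y - y p_x) = \tfrac{B}{2}\,L_z \\
\mathbf{A}^2
  &= \tfrac{B^2}{4}\,(x^2 + y^2) \\
\Longrightarrow\quad (\mathbf{p} + e\mathbf{A})^2
  &= \mathbf{p}^2 + eB\,L_z + \tfrac{e^2B^2}{4}\,(x^2 + y^2)
\end{aligned}
```

With $\mathbf{B} = B\hat{\mathbf{z}}$, the middle term is $e\,\mathbf{B}\cdot\mathbf{L}$, which gives the energy $-\boldsymbol{\mu}_L\cdot\mathbf{B}$ of the orbital dipole moment after dividing by $2m$.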


The second term that arises from a magnetic field is the coupling to the spin. We
already saw this in Section 3.1.3

$$\Delta H = g\,\frac{e}{2m}\,\mathbf{B}\cdot\mathbf{S}$$

where the g-factor is very close to $g\approx2$. Combining the two terms linear in B gives
the so-called Zeeman Hamiltonian

$$H_Z = \frac{e}{2m}\,\mathbf{B}\cdot(\mathbf{L} + 2\mathbf{S}) \qquad (4.11)$$

Note that it’s not quite the total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ that couples to the
magnetic field. There is an extra factor of g = 2 for the spin. This means that the
appropriate dipole moment is

$$\boldsymbol{\mu}_{\rm total} = -\frac{e}{2m}(\mathbf{L} + 2\mathbf{S}) \qquad (4.12)$$

The terms linear in B given in (4.11) are sometimes called the paramagnetic terms;
these are responsible for the phenomenon of Pauli paramagnetism that we met in
the Statistical Physics lectures. The term in (4.10) that is quadratic in B is sometimes
called the diamagnetic term; it is related to Landau diamagnetism that we saw in
Statistical Physics.
Figure 23: Splitting of the 2s and 2p energy levels in a magnetic field. The quantum numbers
$|m_l, m_s\rangle$ are shown.


In what follows, we will work with magnetic fields that are small enough so that we
can neglect the diamagnetic $B^2$ term. In terms of dimensionless quantities, we require
that $eBa_0^2/\hbar \ll 1$ where $a_0$, the Bohr radius, is the characteristic size of the atom. In
practical terms, this means $B\lesssim 10$ T or so.


4.2.1 Strong(ish) Magnetic Fields 


We work with the Zeeman Hamiltonian (4.11). It turns out that for the kinds of 
magnetic fields we typically create in a lab (say $B\lesssim 5$ T or so) the shift in
energy levels from Hz is smaller than the fine-structure shift of energy levels that we 
discussed in Section 3.1. Nonetheless, to gain some intuition for the effect of the Zeeman 
Hamiltonian, we will first ignore the fine-structure of the hydrogen atom. We’ll then 
include the fine structure and do a more realistic calculation. 


We want to solve the Hamiltonian

$$H = H_0 + H_Z = -\frac{\hbar^2}{2m}\nabla^2 - \frac{1}{4\pi\epsilon_0}\frac{Ze^2}{r} + \frac{e}{2m}\,\mathbf{B}\cdot(\mathbf{L} + 2\mathbf{S}) \qquad (4.13)$$

We start from the standard states of the hydrogen atom, $|n,l,m_l,m_s\rangle$ where now we
include both orbital angular momentum and spin quantum numbers. The energy of
these states from $H_0$ is $E_0 = -Ry/n^2$ and each level has degeneracy $2n^2$.


Happily, each of the states $|n,l,m_l,m_s\rangle$ remains an eigenstate of the full Hamiltonian
$H$. The total energy is therefore $E = E_0 + E_Z$, where the Zeeman contribution depends
only on the $m_l$ and $m_s$ quantum numbers

$$(E_Z)_{m_l,m_s} = \langle n,l,m_l,m_s|H_Z|n,l,m_l,m_s\rangle = \frac{e\hbar}{2m}(m_l + 2m_s)B \qquad (4.14)$$

This gives our desired splitting. The two 1s states are no longer degenerate. For the
n = 2 states, the splitting is shown in the figure. The 2s states split into two energy
levels, while the six 2p states split into five. Note that the $m_l = 0$ states from 2p are
degenerate with the 2s states.
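
The counting of levels can be reproduced by listing the values of $m_l + 2m_s$ directly. The sketch below does this for a given l, with energies measured in units of $e\hbar B/2m$. The function name `zeeman_levels` is hypothetical.

```python
from fractions import Fraction

def zeeman_levels(l):
    """Distinct values of m_l + 2 m_s for orbital angular momentum l, in ascending order."""
    half = Fraction(1, 2)
    return sorted({m_l + 2 * m_s for m_l in range(-l, l + 1) for m_s in (-half, half)})

levels_2s = zeeman_levels(0)
levels_2p = zeeman_levels(1)
print("2s:", [int(x) for x in levels_2s])
print("2p:", [int(x) for x in levels_2p])
```

The two 2s states give two levels, the six 2p states give five, and the values shared between the two lists are the degeneracies between 2s and the $m_l = 0$ states of 2p.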
As we mentioned above, the energy spectrum (4.14) holds only when we can neglect
both the fine-structure of the hydrogen atom and the quadratic $B^2$ terms. This restricts
us to a window of relatively large magnetic fields $5\ {\rm T}\lesssim B\lesssim 10\ {\rm T}$. The result (4.14) is
sometimes called the Paschen-Back effect to distinguish it from the weak field Zeeman
effect that we will study below.


The states $|n,l,m_l,m_s\rangle$ are eigenstates of the full Hamiltonian (4.13). This means
that we could now consider perturbing these by the fine-structure corrections we met
in Section 3.1 to find additional splitting.


4.2.2 Weak Magnetic Fields 


When the magnetic fields are small, we have to face up to the fact that the fine-structure 
corrections of Section 3.1 are larger than the Zeeman splitting. In this case, the correct 
way to proceed is to start with the fine structure Hamiltonian and then perturb by Hz. 


Because of the spin-orbit coupling, the eigenstates of the fine structure Hamiltonian
are not labelled by $|n,l,m_l,m_s\rangle$. Instead, as we saw in Section 3.1.3, the eigenstates
are

$$|n,j,m_j;l\rangle$$

where $j = |l\pm\frac{1}{2}|$ is the total angular momentum, and the final label $l$ is not a quantum
number, but is there to remind us whether the state arose from $j = l+\frac{1}{2}$ or $j = l-\frac{1}{2}$.
The upshot of our calculations in Sections 3.1.2 - 3.1.4 is that the energies depend only
on n and j and, to leading order, are given by

$$E_{n,j} = (Z\alpha)^2mc^2\left(-\frac{1}{2n^2} + (Z\alpha)^2\left(\frac{3}{4} - \frac{2n}{2j+1}\right)\frac{1}{2n^4}\right)$$
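
To get a feel for which regime a given field sits in, one can compare the fine-structure splitting of the n = 2 level of hydrogen with the Zeeman energy scale $e\hbar B/2m$. The sketch below evaluates the leading-order formula for $E_{n,j}$ and finds the field at which the two scales are equal. The numerical constants are standard rounded values, and the names are hypothetical.

```python
ALPHA = 1 / 137.036          # fine-structure constant
MC2_EV = 510998.95           # electron rest energy in eV
MU_B_EV_PER_T = 5.788e-5     # Bohr magneton e*hbar/2m in eV per tesla

def fine_structure_energy(n, j, Z=1):
    """Leading-order energy E_{n,j} in eV, including the fine-structure correction."""
    za2 = (Z * ALPHA) ** 2
    return za2 * MC2_EV * (-1 / (2 * n**2) + za2 * (0.75 - 2 * n / (2 * j + 1)) / (2 * n**4))

# Splitting between j = 3/2 and j = 1/2 at n = 2
splitting = fine_structure_energy(2, 1.5) - fine_structure_energy(2, 0.5)
b_cross = splitting / MU_B_EV_PER_T

print(f"n = 2 fine-structure splitting ~ {splitting:.2e} eV")
print(f"Zeeman scale matches it at B ~ {b_cross:.2f} T")
```

This is only an order-of-magnitude comparison for one level. It suggests that the crossover between the weak-field and strong-field regimes for n = 2 in hydrogen happens at fields of order a tesla.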


We now perturb by the Zeeman Hamiltonian $H_Z$ given in (4.11) to find, at leading
order, the shifts of the energy levels given by

$$\Delta E = \frac{eB}{2m}\,\langle n,j,m_j;l|L_z + 2S_z|n,j,m_j;l\rangle \qquad (4.15)$$

You might think that we need to work with degenerate perturbation theory here. Indeed,
the existence of degenerate states with energy $E_{n,j}$ means that we should allow
for the possibility of different quantum numbers $m_j'$ and $l'$ on the state $\langle n,j,m_j';l'|$.
However, since both $[\mathbf{L}^2, H_Z] = 0$ and $[J_z, H_Z] = 0$, the matrix elements vanish unless
$l = l'$ and $m_j = m_j'$. Fortunately, we again find ourselves in a situation where, despite
a large degeneracy, we naturally work in the diagonal basis.


As we will now see, evaluating (4.15) gives a different result from (4.14). Before
proceeding, it’s worth pausing to ask why we get different results. When the magnetic
field is weak, the physics is dominated by the spin-orbit coupling $\mathbf{L}\cdot\mathbf{S}$ that we met
in Section 3.1.3. This locks the orbital angular momentum and spin, so that only the
total angular momentum $\mathbf{J} = \mathbf{L} + \mathbf{S}$ sees the magnetic field. Mathematically, this means
that we use the states $|n,j,m_j;l\rangle$ to compute the energy shifts in (4.15). In contrast,
when the magnetic field is strong, the orbital angular momentum and spin both couple
to the magnetic field. In a (semi-)classical picture, each would precess independently
around the B axis. Mathematically, this means that we use the states $|n,l,m_l,m_s\rangle$ to
compute the energy shifts in (4.14).


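Let’s now compute (4.15). It’s a little trickier because we want the z-components of
$\mathbf{L}$ and $\mathbf{S}$ while the states are specified only by the quantum numbers of $\mathbf{J}$. We’ll need
some algebraic gymnastics. First note the identity

$$i\hbar\,\mathbf{S}\times\mathbf{L} = (\mathbf{L}\cdot\mathbf{S})\,\mathbf{S} - \mathbf{S}\,(\mathbf{L}\cdot\mathbf{S}) \qquad (4.16)$$

which follows from the commutators $[S_i, S_j] = i\hbar\epsilon_{ijk}S_k$ and $[L_i, S_j] = 0$. Further, since
$2\mathbf{L}\cdot\mathbf{S} = \mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2$, we have $[\mathbf{L}\cdot\mathbf{S}, \mathbf{J}] = 0$, which means that we can take the cross
product of (4.16) to find

$$i\hbar\,(\mathbf{S}\times\mathbf{L})\times\mathbf{J} = (\mathbf{L}\cdot\mathbf{S})\,\mathbf{S}\times\mathbf{J} - \mathbf{S}\times\mathbf{J}\,(\mathbf{L}\cdot\mathbf{S})$$

But, by standard vector identities, we also have

$$\begin{aligned}(\mathbf{S}\times\mathbf{L})\times\mathbf{J} &= \mathbf{L}\,(\mathbf{S}\cdot\mathbf{J}) - \mathbf{S}\,(\mathbf{L}\cdot\mathbf{J}) \\ &= \mathbf{J}\,(\mathbf{S}\cdot\mathbf{J}) - \mathbf{S}\,(\mathbf{J}^2)\end{aligned}$$

where, in the second line, we have simply used $\mathbf{L} = \mathbf{J} - \mathbf{S}$. Putting these two together
gives the identity

$$(\mathbf{L}\cdot\mathbf{S})\,\mathbf{S}\times\mathbf{J} - \mathbf{S}\times\mathbf{J}\,(\mathbf{L}\cdot\mathbf{S}) = i\hbar\left(\mathbf{J}\,(\mathbf{S}\cdot\mathbf{J}) - \mathbf{S}\,(\mathbf{J}^2)\right) \qquad (4.17)$$

Finally, we again use the fact that $2\mathbf{L}\cdot\mathbf{S} = \mathbf{J}^2 - \mathbf{L}^2 - \mathbf{S}^2$ to tell us that $\mathbf{L}\cdot\mathbf{S}$ is diagonal
in the basis $|n,j,m_j;l\rangle$. This means that the expectation value of the left-hand side
of (4.17) vanishes in the states $|n,j,m_j;l\rangle$. Obviously the same must be true of the
right-hand side. This gives us the expression

$$\langle n,j,m_j;l|\,\mathbf{S}\,(\mathbf{J}^2)\,|n,j,m_j;l\rangle = \langle n,j,m_j;l|\,\mathbf{J}\,(\mathbf{S}\cdot\mathbf{J})\,|n,j,m_j;l\rangle$$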
Using $2(\mathbf{S}\cdot\mathbf{J}) = \mathbf{J}^2 + \mathbf{S}^2 - \mathbf{L}^2$, we then find that

$$\langle n,j,m_j;l|\,\mathbf{S}\,|n,j,m_j;l\rangle = \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}\;\langle n,j,m_j;l|\,\mathbf{J}\,|n,j,m_j;l\rangle$$
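
This projection formula can be checked by brute force. The sketch below builds matrices for $\mathbf{L}$ and $\mathbf{S}$ on the product space of an orbital angular momentum l and a spin $\frac{1}{2}$, finds the joint eigenstates of $\mathbf{J}^2$ and $J_z$, and compares the expectation value of $S_z$ in each with the right-hand side. It works in units with $\hbar = 1$. The helper names are hypothetical, and the small $J_z$ term added before diagonalising is a numerical trick to separate the degenerate $m_j$ states.

```python
import numpy as np

def spin_matrices(j):
    """Return (Jz, J+) for angular momentum j in the basis m = j, j-1, ..., -j."""
    m = np.arange(j, -j - 1, -1, dtype=float)
    jz = np.diag(m)
    # <m+1| J+ |m> = sqrt(j(j+1) - m(m+1)) sits on the first superdiagonal
    jp = np.diag(np.sqrt(j * (j + 1) - m[1:] * (m[1:] + 1)), k=1)
    return jz, jp

def check_projection(l, s=0.5):
    """Rows of (j, m_j, <S_z>, predicted <S_z>) for every |j, m_j; l> state."""
    lz, lp = spin_matrices(l)
    sz, sp = spin_matrices(s)
    one_l, one_s = np.eye(len(lz)), np.eye(len(sz))
    Lz, Lp = np.kron(lz, one_s), np.kron(lp, one_s)
    Sz, Sp = np.kron(one_l, sz), np.kron(one_l, sp)
    Jz, Jp = Lz + Sz, Lp + Sp
    J2 = Jz @ Jz + 0.5 * (Jp @ Jp.T + Jp.T @ Jp)
    # J^2 and J_z commute, so a small J_z admixture picks out their joint eigenvectors
    _, vecs = np.linalg.eigh(J2 + 0.1 * Jz)
    rows = []
    for v in vecs.T:
        j = 0.5 * (-1 + np.sqrt(1 + 4 * (v @ J2 @ v)))
        mj = v @ Jz @ v
        ratio = (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))
        rows.append((round(j, 6), round(mj, 6), v @ Sz @ v, ratio * mj))
    return rows

for j, mj, sz_exact, sz_formula in check_projection(1):
    print(f"j = {j}, m_j = {mj:+.1f}: <S_z> = {sz_exact:+.4f}, formula = {sz_formula:+.4f}")
```

For l = 1 this covers the two $j = \frac{1}{2}$ states and the four $j = \frac{3}{2}$ states, and the two columns agree in every case.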


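This is the result we need. Using $\mathbf{L} = \mathbf{J} - \mathbf{S}$, the shift in energy levels (4.15) can be
written as

$$\begin{aligned}\Delta E &= \frac{eB}{2m}\,\langle n,j,m_j;l|J_z + S_z|n,j,m_j;l\rangle \\ &= \frac{eB}{2m}\left(m_j\hbar + \langle n,j,m_j;l|S_z|n,j,m_j;l\rangle\right) = \frac{e\hbar B}{2m}\,m_j\,g_J\end{aligned} \qquad (4.18)$$

where $g_J$ is known as the Landé g-factor, and is the ratio of angular momentum quantum
numbers given by

$$g_J = 1 + \frac{j(j+1) + s(s+1) - l(l+1)}{2j(j+1)}$$

For a single electron with $s = \frac{1}{2}$ this is $g_J = 1\pm 1/(2l+1)$ for $j = l\pm\frac{1}{2}$. It is a number
which lies between 1 and 2 for $j = l+\frac{1}{2}$, and between $\frac{2}{3}$ and 1 for $j = l-\frac{1}{2}$.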
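
A few values of the Landé g-factor are worth having to hand. The sketch below evaluates the formula with exact fractions for the lowest single-electron states and lists the resulting weak-field shifts $m_j g_J$ in units of $e\hbar B/2m$. The function name `lande_g` is hypothetical.

```python
from fractions import Fraction

def lande_g(j, l, s=Fraction(1, 2)):
    """Lande g-factor g_J = 1 + [j(j+1) + s(s+1) - l(l+1)] / [2 j (j+1)]."""
    j, l, s = Fraction(j), Fraction(l), Fraction(s)
    return 1 + (j * (j + 1) + s * (s + 1) - l * (l + 1)) / (2 * j * (j + 1))

g_s_half = lande_g(Fraction(1, 2), 0)    # s_{1/2}
g_p_half = lande_g(Fraction(1, 2), 1)    # p_{1/2}
g_p_3half = lande_g(Fraction(3, 2), 1)   # p_{3/2}
print("s1/2:", g_s_half, " p1/2:", g_p_half, " p3/2:", g_p_3half)

# Weak-field shifts m_j * g_J of the four p_{3/2} states, in units of e*hbar*B/2m
shifts_p_3half = [m * g_p_3half for m in (Fraction(-3, 2), Fraction(-1, 2),
                                          Fraction(1, 2), Fraction(3, 2))]
print("p3/2 shifts:", [str(x) for x in shifts_p_3half])
```

For an s state the orbital part is absent and the formula returns the pure spin value g = 2. For $j = l - \frac{1}{2}$ the spin is on average anti-aligned with $\mathbf{J}$, which is why the formula gives a value below 1.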


We see that our final answer (4.18) for the Zeeman splitting is rather simple. Indeed,
it’s the answer we would expect for a magnetic dipole of the form,

$$\boldsymbol{\mu}_J = -\frac{eg_J}{2m}\,\mathbf{J} \qquad (4.19)$$

We see here the effect of the spin-orbit interaction. As explained above, it locks the spin
and angular momentum together into the total angular momentum $\mathbf{J}$. This changes
the dipole moment from (4.12) to this result.


The splitting of atomic energy levels allows us to see magnetic fields from afar. For 
example, we know the strength of magnetic fields in sunspots through the Zeeman 
splitting of the spectral lines of iron. 


As the magnetic field is increased, the Zeeman interaction becomes increasingly competitive
with the spin orbit coupling, and we must interpolate between (4.19) and the
Paschen-Back effect (4.12). With no hierarchy of scales, life is more complicated and
we must treat both $H_Z$ and the fine-structure Hamiltonian on the same footing. In practice, it
is difficult to reach magnetic fields which dominate the spin-orbit interaction.


However, the discussion above also holds for the hyperfine interaction, whose energy
splitting is comparable with magnetic fields that we can achieve in the lab. In this case,
the total angular momentum is $\mathbf{F} = \mathbf{J} + \mathbf{I}$ with $\mathbf{I}$ the spin of the nucleus. Including
the hyperfine interaction between the electron and nuclear spins, it is not hard to show
that the magnetic moment of the atom becomes

$$\boldsymbol{\mu}_F = -\frac{eg_F}{2m}\,\mathbf{F}$$

where

$$g_F = g_J\,\frac{F(F+1) + j(j+1) - I(I+1)}{2F(F+1)}$$

4.2.3 The Discovery of Spin 


The suggestion that the electron carries an intrinsic angular momentum — which we 
now call spin — was first made by the Dutch physicists Samuel Goudsmit and George 
Uhlenbeck in 1925. At the time, both were students of Ehrenfest. 


With hindsight, there was plenty of evidence pointing to the existence of spin. As 
we’ve seen in these lectures, the electron spin affects the atomic energy levels and 
resulting spectral lines in two different ways: 


• Spin-Orbit Coupling: This is particularly prominent in sodium, where the existence
of electron spin gives rise to a splitting of the 3p states. The transition
of these states back to the 3s ground state results in the familiar yellow colour
emitted by sodium street lights, and was long known to consist of two distinct
lines rather than one.


• Zeeman Effect: The magnetic field couples to both the orbital angular momentum
and to the electron spin. If the angular momentum is quantised as $l \in \mathbf{Z}$, we would
expect to see a splitting into $(2l + 1)$ states, which is always an odd number.
However, it was known that there are atoms, such as hydrogen, where the
splitting results in an even number of states. Historically this was referred to as
the anomalous Zeeman effect, reflecting the fact that no one could make sense
of it. We now know that it arises because the electron spin is quantised as a
half-integer.


On the more theoretical level, in early 1925 Pauli proposed his famous exclusion prin- 
ciple for the first time. He employed this to explain the structure of the periodic table, 
but it only worked if the electrons had four quantum numbers rather than three — 
what we now call $n$, $l$, $m$ and $m_s$.

Despite these many hints, the proposal of Goudsmit and Uhlenbeck was not greeted
with unanimous enthusiasm. Pauli was particularly dismissive. Lorentz, mired in a 
classical worldview, argued that if the electron was spinning like the Earth then its 
surface would have to be travelling faster than light. Indeed, a few months previously 
Kronig had privately considered the possibility of an electron spin, but had been talked 
out of it by these great minds. 


One key reason for the skepticism lay in the initial difficulty of reconciling the spin- 
orbit and Zeeman effects: if you get the Zeeman splitting right, then the fine-structure 
splitting is off by a factor of 2. Here is what Goudsmit had to say:³


“The next day, I received a letter from Heisenberg and he refers to our
“mutige Note” (courageous note). I did not even know we needed courage
to publish that. I wasn’t courageous at all.... He says: “What have you done
with the factor 2?” Which factor? Not the slightest notion.


Of course, we ought to have made a quantitative calculation of the size of
the splittings...We did not do that because we imagined it would be very
difficult...We didn’t know how to do it, and therefore we had not done it.
Luckily we did not know, because if we had done it, then we would have run
into an error by a factor of 2.”


This was only resolved a year later when Thomas discovered the relativistic effect that 
we now call Thomas precession. As we saw in Section 3.1.3, this changes the magnitude 
of the spin-orbit coupling by the necessary factor of 2. It was only with this addition 
to the theory that everything fitted and the spin of the electron became generally 
accepted. 


The intrinsic spin of the electron is one of the most important discoveries in atomic 
and particle physics. It was ultimately explained by Dirac as a consequence of special 
relativity. For this Dirac was awarded the Nobel prize. For Goudsmit and Uhlenbeck, 
there was no such luck. Instead, in 1927, they were awarded their PhDs. 


4.3 Shine a Light 


In this section we look at what happens if you take an atom and shine a light on it. 
We'll continue to treat the electromagnetic field as classical. Ultimately we’ll see that 
this approach has shortcomings and in later sections we'll consider both the atom and 
the light to be quantum. 


³ You can read the full, charming speech at http://lorentz.leidenuniv.nl/history/spin/goudsmit.html.

A monochromatic light wave is described by oscillating electric and magnetic fields,

$$\mathbf{E} = \mathbf{E}_0\, e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)} \quad\text{and}\quad \mathbf{B} = \frac{1}{c}\,(\hat{\mathbf{k}}\times\mathbf{E}_0)\, e^{i(\mathbf{k}\cdot\mathbf{x}-\omega t)}$$

with $\omega^2 = c^2k^2$. The wavelength of the light is $\lambda = 2\pi/k = 2\pi c/\omega$. We will require
that: 


• The wavelength is much larger than the size of the atom: $\lambda \gg a_0$. This means that
the electron does not experience a spatial gradient in the electric and magnetic
fields; only a temporal change.


• The frequency is tuned to be close to that of the transition between two atomic
states. For simplicity, we will focus on the ground state and first excited state.
We then require $\omega \approx \omega_0$ where $\hbar\omega_0 = (E_2 - E_1)$. This condition will allow us to
restrict our attention to just these two states, ignoring the others.


Note that the second condition is compatible with the first. A typical energy
level of hydrogen corresponds to a wavelength $\lambda \sim 2\pi a_0/\alpha$, so the factor of
$\alpha \approx 1/137$ gives us a leeway of a couple of orders of magnitude.
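To put numbers on the two conditions, here is a small sketch comparing the wavelength of light resonant with the hydrogen 1s to 2p transition against the Bohr radius. The 10.2 eV splitting and the rounded physical constants are inputs assumed for the illustration; they are not quoted in the notes.

```python
import math

# Physical constants in SI units (standard values, rounded).
H_PLANCK = 6.62607015e-34   # Planck constant, J s
C_LIGHT = 2.99792458e8      # speed of light, m / s
EV = 1.602176634e-19        # one electronvolt in joules
A0 = 5.29177210903e-11      # Bohr radius, m
ALPHA = 7.2973525693e-3     # fine-structure constant

delta_e = 10.2 * EV                          # assumed 1s -> 2p energy splitting
wavelength = H_PLANCK * C_LIGHT / delta_e    # lambda = 2 pi c / omega_0
ratio = wavelength / A0                      # how much larger than the atom
rough_estimate = 2 * math.pi / ALPHA         # the scaling lambda / a0 ~ 2 pi / alpha

print(f"wavelength     = {wavelength * 1e9:.1f} nm")
print(f"lambda / a0    = {ratio:.0f}")
print(f"2 pi / alpha   = {rough_estimate:.0f}")
```

Both the explicit ratio and the rough scaling estimate come out in the hundreds to thousands, so the long-wavelength condition holds comfortably for resonant light.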


Shining a light means that we perturb the atom by both an electric and magnetic
field. We know from Sections 4.1 and 4.2 that the typical energy shift in the linear
Stark effect is $\Delta E \sim eEa_0 \sim eE\hbar/mc\alpha$, while the typical energy shift in the Zeeman
effect is $\Delta E \sim eB\hbar/2m \sim eE\hbar/2mc$. We see that the effects of the electric field are
larger by a factor of $1/\alpha$. For this reason, we neglect the oscillating magnetic field in
our discussion and focus only on the electric field.


Because $\lambda \gg a_0$, we can treat the electric field as time-dependent, but spatially
uniform. We describe such a field by a potential $\phi = \mathbf{E}\cdot\mathbf{x}$, with $\mathbf{A} = 0$. This means
that the full Hamiltonian is $H = H_0 + \Delta H(t)$, where the time-dependent perturbation
is given by

$$\Delta H(t) = e\mathbf{E}_0\cdot\mathbf{x}\,\cos(\omega t)$$

Our goal is to find the eigenstates of the time-dependent Hamiltonian. This is a
straightforward exercise.


4.3.1 Rabi Oscillations 


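By construction, we will only consider two states, $|\psi_1\rangle$ and $|\psi_2\rangle$, obeying

$$H_0|\psi_i\rangle = E_i|\psi_i\rangle$$

Within the space spanned by these two states, the most general ansatz is

$$|\Psi(t)\rangle = c_1(t)\,e^{-iE_1t/\hbar}|\psi_1\rangle + c_2(t)\,e^{-iE_2t/\hbar}|\psi_2\rangle$$

with $|c_1|^2+|c_2|^2 = 1$. We substitute this into the time-dependent Schrödinger equation,

$$i\hbar\frac{\partial|\Psi\rangle}{\partial t} = \big(H_0 + \Delta H(t)\big)|\Psi\rangle$$

to get

$$i\hbar\dot{c}_1 e^{-iE_1t/\hbar}|\psi_1\rangle + i\hbar\dot{c}_2 e^{-iE_2t/\hbar}|\psi_2\rangle = c_1 e^{-iE_1t/\hbar}\,\Delta H|\psi_1\rangle + c_2 e^{-iE_2t/\hbar}\,\Delta H|\psi_2\rangle$$

Now we take the overlap with $\langle\psi_1|$ and $\langle\psi_2|$ to find two coupled differential equations

$$i\hbar\dot{c}_1 = c_1\langle\psi_1|\Delta H|\psi_1\rangle + c_2\langle\psi_1|\Delta H|\psi_2\rangle\,e^{-i\omega_0 t}$$
$$i\hbar\dot{c}_2 = c_1\langle\psi_2|\Delta H|\psi_1\rangle\,e^{+i\omega_0 t} + c_2\langle\psi_2|\Delta H|\psi_2\rangle$$

where

$$\hbar\omega_0 = E_2 - E_1$$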


Our next task is to compute the matrix elements $\langle\psi_i|\Delta H|\psi_j\rangle$. The diagonal matrix
elements are particularly simple

$$\langle\psi_i|\Delta H|\psi_i\rangle = e\mathbf{E}_0\cdot\langle\psi_i|\mathbf{x}|\psi_i\rangle\cos(\omega t) = 0$$

These vanish because each $|\psi_i\rangle$ is a parity eigenstate and the parity-odd operator $\mathbf{x}$
is sandwiched between two copies of it. This is the same argument that we used in Section 4.1 to
show that the linear Stark effect vanishes for nearly all states.


The off-diagonal matrix elements are non-vanishing as long as $|\psi_1\rangle$ has opposite
parity to $|\psi_2\rangle$. We define the Rabi frequency $\Omega$ as

$$\hbar\Omega = e\mathbf{E}_0\cdot\langle\psi_1|\mathbf{x}|\psi_2\rangle \tag{4.20}$$

Note in particular that the Rabi frequency is proportional to the amplitude of the
electric field. We’re left having to solve the coupled differential equations

$$i\dot{c}_1 = \Omega\cos(\omega t)\,e^{-i\omega_0 t}\,c_2$$
$$i\dot{c}_2 = \Omega\cos(\omega t)\,e^{+i\omega_0 t}\,c_1$$

In fact, there is one further simplification that we make. We write these as

$$i\dot{c}_1 = \frac{\Omega}{2}\left(e^{i(\omega-\omega_0)t} + e^{-i(\omega+\omega_0)t}\right)c_2$$
$$i\dot{c}_2 = \frac{\Omega}{2}\left(e^{-i(\omega-\omega_0)t} + e^{i(\omega+\omega_0)t}\right)c_1 \tag{4.21}$$

The right-hand side of each of these equations has two oscillatory terms. Recall, however,
that we required our frequency of light to be close to the atomic energy splitting
$\omega_0$. This means, in particular, that

$$|\omega - \omega_0| \ll \omega + \omega_0$$

So the second terms in (4.21) oscillate much faster than the first. We are interested
only in the behaviour on long time scales, comparable to $|\omega-\omega_0|^{-1}$, over which the
fast oscillations simply average out. For this reason, we neglect the terms proportional
to $e^{\pm i(\omega+\omega_0)t}$.


This is known as the rotating wave approximation, even though it’s not
obvious that it has anything to do with rotating waves! (For what it’s worth, the name
comes from nuclear magnetic resonance where a similar approximation means that you
keep the wave which rotates in the same way as a spin and throw away the wave which
rotates in the opposite direction.)

Invoking the rotating wave approximation, our equations simplify to

$$i\dot{c}_1 = \frac{\Omega}{2}\,e^{i\delta t}\,c_2 \quad\text{and}\quad i\dot{c}_2 = \frac{\Omega}{2}\,e^{-i\delta t}\,c_1 \tag{4.22}$$

where $\delta = \omega - \omega_0$ tells us how much the frequency of light $\omega$ differs from the natural
frequency of the atomic energy levels $\omega_0$.


Resonance 


We start by considering the case $\delta = 0$, so that the energy of the light coincides with that of
the level splitting. In this case the equations (4.22) are particularly simple: they are
equivalent to the familiar second order differential equation

$$\ddot{c}_1 = -\frac{\Omega^2}{4}\,c_1 \quad\Rightarrow\quad c_1 = \cos\left(\frac{\Omega t}{2}\right) \quad\text{and}\quad c_2 = -i\sin\left(\frac{\Omega t}{2}\right)$$

where we picked initial conditions so that we sit in the ground state $|\Psi\rangle = |\psi_1\rangle$ at time
$t = 0$.


We see that something lovely happens. The atom oscillates between the ground
state and the first excited state with frequency $\Omega$. This phenomenon is known as Rabi
oscillations or, sometimes, Rabi flopping.


The probability that the atom sits in the excited state at time $t$ is given by $P_2(t) =
|c_2|^2 = \sin^2(\Omega t/2)$. This means that if we start with the atom in the ground state and
shine a pulse of resonant light for a time $T = \pi/\Omega$ then the atom will definitely be in
the first excited state. This is known as a $\pi$-pulse.


Alternatively, we could act with a “$\pi/2$-pulse”, shining resonant light for a time $T =
\pi/2\Omega$. This leaves the atom in the superposition $|\Psi\rangle = (|\psi_1\rangle - i|\psi_2\rangle)/\sqrt{2}$. This allows
us to experimentally create superpositions of states.
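The two pulse durations can be checked directly from the resonant solution $c_1 = \cos(\Omega t/2)$, $c_2 = -i\sin(\Omega t/2)$. In this minimal sketch the Rabi frequency value and the helper name `resonant_amplitudes` are hypothetical choices made only for illustration.

```python
import math

def resonant_amplitudes(rabi, t):
    """Resonant (delta = 0) solution with the atom in the ground state at t = 0."""
    c1 = math.cos(rabi * t / 2)
    c2 = -1j * math.sin(rabi * t / 2)
    return c1, c2

RABI = 2 * math.pi * 1.0e6   # hypothetical Rabi frequency, rad / s

# Pulse name -> duration T
pulses = {"pi": math.pi / RABI, "pi/2": math.pi / (2 * RABI)}

populations = {}
for name, duration in pulses.items():
    c1, c2 = resonant_amplitudes(RABI, duration)
    populations[name] = (abs(c1) ** 2, abs(c2) ** 2)
    print(f"{name}-pulse: P1 = {populations[name][0]:.3f}, P2 = {populations[name][1]:.3f}")
```

The $\pi$-pulse moves the whole population to the excited state, while the $\pi/2$-pulse splits it equally, with the relative phase $-i$ carried by $c_2$.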

Off-Resonance 


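When the incident light is detuned from resonance, so $\delta \neq 0$, the first order equations
(4.22) can be combined into the second order differential equation for $c_1$

$$\frac{d^2c_1}{dt^2} - i\delta\frac{dc_1}{dt} + \frac{\Omega^2}{4}c_1 = 0 \quad\Rightarrow\quad \left(\frac{d}{dt} - \frac{i\delta}{2} + \frac{i\sqrt{\Omega^2+\delta^2}}{2}\right)\left(\frac{d}{dt} - \frac{i\delta}{2} - \frac{i\sqrt{\Omega^2+\delta^2}}{2}\right)c_1 = 0$$

This has the solution

$$c_1(t) = e^{i\delta t/2}\left[A\cos\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right) + B\sin\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right)\right]$$

We’ll again require that all the particles sit in the ground state $|\psi_1\rangle$ at time $t = 0$.
This fixes $A = 1$ but this time we don’t have $B = 0$. Instead, we use the first of the
equations (4.22) to determine $c_2$ and require that $c_2(t = 0) = 0$. This gives the solution

$$c_1 = e^{i\delta t/2}\left[\cos\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right) - \frac{i\delta}{\sqrt{\Omega^2+\delta^2}}\sin\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right)\right]$$

and

$$c_2 = -i\,e^{-i\delta t/2}\,\frac{\Omega}{\sqrt{\Omega^2+\delta^2}}\sin\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right)$$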


We see that the oscillations now occur at the generalised Rabi frequency $\sqrt{\Omega^2+\delta^2}$.
This means that as we detune away from resonance, the oscillation rate increases. The
probability of sitting in the excited state is now

$$P_2(t) = |c_2(t)|^2 = \frac{\Omega^2}{\Omega^2+\delta^2}\sin^2\left(\frac{\sqrt{\Omega^2+\delta^2}}{2}\,t\right) \tag{4.23}$$

We see that, for $\delta \neq 0$, this probability never reaches one: we can no longer be certain
that we have excited the atom. However, the Rabi frequency $\Omega$ is proportional to the
amplitude of the electric field (4.20). This means that as we increase the intensity of
the electric field, the probability of excitation increases. In contrast, for very weak
electric fields we have $\delta \gg \Omega$ and the probability never gets above $\Omega^2/\delta^2$,

$$P_2(t) \approx \frac{\Omega^2}{\delta^2}\sin^2\left(\frac{\delta t}{2}\right) \tag{4.24}$$
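One way to build confidence in the rotating wave approximation is to integrate the coupled equations numerically with the full $\cos(\omega t)$ drive and compare against the closed form (4.23). The sketch below does this with a hand-written fourth-order Runge-Kutta step. The parameter values (in units where $\omega_0 = 1$), the step count and the function names are illustrative assumptions.

```python
import cmath
import math

def derivatives(t, c1, c2, rabi, omega, omega0):
    """Time derivatives of c1, c2 before the rotating wave approximation is made."""
    drive = rabi * math.cos(omega * t)
    dc1 = -1j * drive * cmath.exp(-1j * omega0 * t) * c2
    dc2 = -1j * drive * cmath.exp(+1j * omega0 * t) * c1
    return dc1, dc2

def excited_probability_full(rabi, omega, omega0, t_end, steps=20000):
    """Classical RK4 integration starting from the ground state (c1 = 1, c2 = 0)."""
    c1, c2 = 1.0 + 0.0j, 0.0 + 0.0j
    dt = t_end / steps
    for n in range(steps):
        t = n * dt
        k1 = derivatives(t, c1, c2, rabi, omega, omega0)
        k2 = derivatives(t + dt / 2, c1 + dt / 2 * k1[0], c2 + dt / 2 * k1[1], rabi, omega, omega0)
        k3 = derivatives(t + dt / 2, c1 + dt / 2 * k2[0], c2 + dt / 2 * k2[1], rabi, omega, omega0)
        k4 = derivatives(t + dt, c1 + dt * k3[0], c2 + dt * k3[1], rabi, omega, omega0)
        c1 += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        c2 += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return abs(c2) ** 2

def excited_probability_rwa(rabi, delta, t):
    """Closed-form result (4.23) obtained within the rotating wave approximation."""
    w = math.hypot(rabi, delta)          # generalised Rabi frequency
    return (rabi / w) ** 2 * math.sin(w * t / 2) ** 2

OMEGA0 = 1.0     # atomic transition frequency (sets the units)
RABI = 0.01      # weak drive: Rabi frequency much smaller than OMEGA0
DELTA = 0.01     # detuning, chosen equal to the Rabi frequency

t_peak = math.pi / math.hypot(RABI, DELTA)   # first maximum of (4.23)
p_rwa = excited_probability_rwa(RABI, DELTA, t_peak)
p_full = excited_probability_full(RABI, OMEGA0 + DELTA, OMEGA0, t_peak)
print(f"RWA : {p_rwa:.4f}")
print(f"full: {p_full:.4f}")
```

With the detuning equal to the Rabi frequency the peak excitation probability in (4.23) is one half, and the full integration should agree up to small corrections controlled by $\Omega/\omega_0$.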


Electric Dipoles vs Magnetic Dipoles 


Our discussion above describes transitions between states that are driven by the oscil- 
lating electric field. These are called electric dipole transitions. 


However, there are also situations where the oscillating magnetic field dominates the
physics. This occurs, for example, in fine structure and hyperfine structure transitions,
both of which involve flipping a spin degree of freedom. The theory underlying these
transitions is the same as we described above, now with a Rabi frequency given by

$$\hbar\Omega = \mathbf{B}\cdot\langle\psi_1|\boldsymbol{\mu}|\psi_2\rangle$$

where $\boldsymbol{\mu}$ is the atomic magnetic moment. Such transitions are called magnetic dipole
transitions.


The oscillatory behaviour described above was first observed in hyperfine transitions.
For this Isidor Rabi won the 1944 Nobel prize.


4.3.2 Spontaneous Emission 


Take an atom in an excited state, place it in a vacuum, and leave it alone. What 
happens? If we model the atom using the usual quantum mechanical Hamiltonian for 
the electrons orbiting a nucleus, then we get a simple prediction: nothing happens. 
Any quantum system when placed in an energy eigenstate will stay there, with only its 
phase oscillating as $e^{-iEt/\hbar}$.

Yet in the real world, something does happen. An atom in an excited state will 
decay, dropping down to a lower state and emitting a photon in the process. This is 
called spontaneous emission. This is not a process which happens deterministically. 
We cannot predict when a given atom will decay. We can only say that, on average, 
a given excited state has a lifetime $\tau$. We would like to know how to calculate this
lifetime.

How can we describe spontaneous emission in quantum mechanics? It is difficult
because we need a framework in which the number of particles changes: before the 
decay, we have just the atom; after the decay we have both the atom and the photon. 
To model this properly we need to understand how to treat the electromagnetic field 
in a manner consistent with quantum mechanics. This is the subject of quantum field 
theory. We will make baby steps towards this in Section 4.4. 


However, it turns out that there is a clever statistical mechanics argument, originally 
due to Einstein, that allows us to compute the lifetime $\tau$ of excited states without using
the full framework of quantum field theory. We now describe this argument. 


Rate Equations 


Consider a large number of atoms. We start with $N_1$ in the ground state and $N_2$ in
the excited state. Each of these excited atoms will spontaneously decay to the ground
state with a rate that we call $A_{21}$. We model this with the rate equation

$$\frac{dN_2}{dt} = -A_{21}N_2 \tag{4.25}$$

The solution tells us that the population of excited atoms decays with a characteristic
exponential behaviour, with lifetime $\tau$ defined as

$$N_2(t) = N_2(0)\,e^{-t/\tau} \quad\text{with}\quad \tau = \frac{1}{A_{21}} \tag{4.26}$$

Our ultimate goal is to compute $A_{21}$. To do this, we will take the unusual step of
making the situation more complicated: we choose to bathe the atoms in light.


The light gives rise to two further processes. First, the ground state atoms absorb
light and are promoted to excited states. This happens at a rate which is proportional
to the intensity of light, $\rho(\omega)$. Furthermore, as we saw above, the dominant effect
comes from the light which is resonant with the energy difference of the atomic states,

$$\omega = \omega_0 = \frac{E_2 - E_1}{\hbar}$$

We call the total rate for the ground state to be excited to the excited state $\rho(\omega_0)B_{12}$.
(There is a slight subtlety here: the rate actually gets contributions from all frequencies,
but these are absorbed into the definition of $B_{12}$. We’ll see this in more detail below.)

The second process is a little counter-intuitive: the excited states can receive extra
encouragement to decay to the ground state from the incident light. This process is
known as stimulated emission. It too is proportional to the intensity of light. We
denote the rate as $\rho(\omega_0)B_{21}$. If you’re suspicious about this effect, you can always view
$B_{21}$ as an extra parameter which could plausibly vanish. However, we’ll see that one
outcome of the argument is that $B_{21} \neq 0$: the phenomenon of stimulated emission is
necessary on consistency grounds.


The net effect of bathing the atoms in light is that the rate equation (4.25) becomes

$$\frac{dN_2}{dt} = \rho(\omega_0)\left(B_{12}N_1 - B_{21}N_2\right) - A_{21}N_2$$

There is a similar equation for the population of ground state atoms

$$\frac{dN_1}{dt} = -\rho(\omega_0)\left(B_{12}N_1 - B_{21}N_2\right) + A_{21}N_2$$

The coefficients $A_{21}$, $B_{21}$ and $B_{12}$ are called the Einstein A and B coefficients.


In equilibrium, the populations are unchanging. In this case, the density of light of
frequency $\omega_0$ must be given by

$$\rho(\omega_0) = \frac{A_{21}N_2}{B_{12}N_1 - B_{21}N_2} \tag{4.27}$$
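The equilibrium condition can be illustrated by stepping the two rate equations forward in time until the populations stop changing, then recovering the light density from the final populations. The coefficient values below are arbitrary illustrative numbers in arbitrary units, and the simple Euler stepping is a choice made for brevity rather than accuracy.

```python
import math

def relax_to_equilibrium(n1, n2, rho, a21, b12, b21, dt=1e-3, steps=20000):
    """Euler-step the rate equations for N1 and N2 under a fixed light density rho."""
    for _ in range(steps):
        dn2_dt = rho * (b12 * n1 - b21 * n2) - a21 * n2
        n2 += dn2_dt * dt
        n1 -= dn2_dt * dt        # dN1/dt = -dN2/dt, so N1 + N2 is conserved
    return n1, n2

# Illustrative Einstein coefficients and light density (arbitrary units).
A21, B12, B21, RHO = 1.0, 2.0, 1.5, 0.5

n1, n2 = relax_to_equilibrium(1000.0, 0.0, RHO, A21, B12, B21)
rho_recovered = A21 * n2 / (B12 * n1 - B21 * n2)

print(f"N1 = {n1:.2f}, N2 = {n2:.2f}")
print(f"rho recovered from the populations = {rho_recovered:.6f}")
```

Starting with every atom in the ground state, the populations settle to fixed values, and inserting them into the equilibrium formula returns the light density that was put in.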


Throwing in Some Thermodynamics 


At this point, we look at the problem from the more microscopic perspective of statis- 
tical mechanics. (See the lecture notes on Statistical Physics for the necessary back- 
ground.) Before we proceed, we need to specify more information about the atom. We 
denote the degeneracy of the ground states, with energy $E_1$, as $g_1$ and the degeneracy
of excited states, with energy $E_2$, as $g_2$.


We now assume that the whole atom/light mix sits in thermal equilibrium at a
temperature $T$. Then the Boltzmann distribution tells us that the relative population
of atomic states is given by

$$\frac{N_2}{N_1} = \frac{g_2\,e^{-E_2/k_BT}}{g_1\,e^{-E_1/k_BT}} = \frac{g_2}{g_1}\,e^{-\hbar\omega_0/k_BT}$$

Furthermore, the energy density of light is given by the Planck distribution,

$$\rho(\omega) = \frac{\hbar}{\pi^2c^3}\,\frac{\omega^3}{e^{\hbar\omega/k_BT} - 1} \tag{4.28}$$

Combining these formulae with our previous result (4.27), we find the result

$$\rho(\omega_0) = \frac{\hbar}{\pi^2c^3}\,\frac{\omega_0^3}{e^{\hbar\omega_0/k_BT} - 1} = \frac{A_{21}}{B_{12}(g_1/g_2)\,e^{\hbar\omega_0/k_BT} - B_{21}}$$

We want this equation to hold for any temperature $T$. This is a strong requirement.
First, it relates the absorption and stimulated emission coefficients

$$g_1B_{12} = g_2B_{21} \tag{4.29}$$

We see that, as promised, it is a thermodynamic requirement that stimulated emission
occurs if absorption can occur. More surprisingly, we also get a relationship between
the rates for stimulated emission and spontaneous emission

$$A_{21} = \frac{\hbar\omega_0^3}{\pi^2c^3}\,B_{21} \tag{4.30}$$


This is a remarkable result. All information about the temperature of the background 
light bath has dropped out. Instead, we are left with a relationship that only depends 
on the inherent properties of the atom itself. Furthermore, the probability for an atom 
to decay in vacuum is related to the probability for it to decay when bombarded by 
light. 
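For readers who want the intermediate step, the two Einstein relations follow by clearing denominators in the equilibrium condition and demanding that it hold for every temperature. The shorthand $x = \hbar\omega_0/k_BT$ is introduced here only for this sketch.

```latex
% Equate the Planck form of rho(omega_0) with the rate-equation form and clear denominators,
% writing x = \hbar\omega_0 / k_B T:
\frac{\hbar\omega_0^3}{\pi^2 c^3}\left( \frac{g_1}{g_2}\,B_{12}\,e^{x} - B_{21} \right) = A_{21}\left( e^{x} - 1 \right)

% This must hold for all T, hence for all x, so the coefficient of e^{x} and the
% constant term must match separately:
\frac{\hbar\omega_0^3}{\pi^2 c^3}\,\frac{g_1}{g_2}\,B_{12} = A_{21}
\qquad\text{and}\qquad
\frac{\hbar\omega_0^3}{\pi^2 c^3}\,B_{21} = A_{21}

% The second condition is (4.30); equating the two left-hand sides gives (4.29):
g_1 B_{12} = g_2 B_{21}
```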


Computing the Einstein Coefficients 


If we know one of the three Einstein coefficients, then the relations (4.29) and (4.30) 
immediately give us the other two. But we have already computed the probability for 
an atom to be excited in Section 4.3.1 in the context of Rabi oscillations. 


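We still need to do a little work to translate between the two results. In the limit of
weak electromagnetic fields, the probability to excite the ground state by shining light
of frequency $\omega$ was given in (4.24)

$$P_2(t) = \frac{\Omega^2}{(\omega-\omega_0)^2}\sin^2\left(\frac{(\omega-\omega_0)t}{2}\right)$$

If we take the electric field to be $\mathbf{E}_0 = (0,0,E)$, then the (square of the) Rabi frequency
is given by (4.20)

$$\Omega^2 = \frac{e^2E^2}{\hbar^2}\,|\langle\psi_1|z|\psi_2\rangle|^2$$

In thermal equilibrium we have photons of all frequencies $\omega$, whose energy distribution
is governed by the blackbody formula (4.28). This means that we have electric fields
$E$ of all frequencies. Recall that the energy density $\rho(\omega)$ stored in an electric field is
$\epsilon_0E^2/2$. Integrating over frequencies, the probability to sit in the excited state is

$$P_2(t) = \frac{2e^2}{\epsilon_0\hbar^2}\,|\langle\psi_1|z|\psi_2\rangle|^2\int d\omega\,\frac{\rho(\omega)}{(\omega-\omega_0)^2}\sin^2\left(\frac{(\omega-\omega_0)t}{2}\right)$$

This integral is dominated by the region near $\omega = \omega_0$. We therefore replace $\rho(\omega)$ by
$\rho(\omega_0)$ and bring it outside the integral,

$$P_2(t) \approx \frac{2e^2}{\epsilon_0\hbar^2}\,\rho(\omega_0)\,|\langle\psi_1|z|\psi_2\rangle|^2\int d\omega\,\frac{1}{(\omega-\omega_0)^2}\sin^2\left(\frac{(\omega-\omega_0)t}{2}\right)$$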


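Note that this step ensures that the rate is indeed proportional to $\rho(\omega_0)$, which was an
assumption in deriving our rate equations above. Finally, to do the integral we write
$x = (\omega - \omega_0)t/2$ and extend the range from $-\infty$ to $\infty$,

$$P_2(t) \approx \frac{2e^2}{\epsilon_0\hbar^2}\,\rho(\omega_0)\,|\langle\psi_1|z|\psi_2\rangle|^2\,\frac{t}{2}\int_{-\infty}^{+\infty}dx\,\frac{\sin^2x}{x^2} = \frac{e^2\pi}{\epsilon_0\hbar^2}\,\rho(\omega_0)\,|\langle\psi_1|z|\psi_2\rangle|^2\,t$$

The fact that the probability grows linearly with $t$ is an artefact of the approximation
above. The answer is correct only for small $t$. The real lesson to take from this is that
the rate $\dot{P}_2(t)$ is given by

$$\text{Rate of Absorption} = \dot{P}_2(t) = \frac{e^2\pi}{\epsilon_0\hbar^2}\,\rho(\omega_0)\,|\langle\psi_1|z|\psi_2\rangle|^2$$

from which we get the Einstein coefficient

$$B_{12} = \frac{e^2\pi}{\epsilon_0\hbar^2}\,|\langle\psi_1|z|\psi_2\rangle|^2$$

Finally, since the light is bombarding the atom from all directions, this is often written
using rotationally invariant matrix elements,

$$B_{12} = \frac{e^2\pi}{3\epsilon_0\hbar^2}\,|\langle\psi_1|\mathbf{x}|\psi_2\rangle|^2 \tag{4.31}$$

Using the Einstein relations (4.29) and (4.30), we see that the smaller the matrix
element, the longer lived the excited state.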


4.3.3 Selection Rules 


What happens if the matrix element (4.31) vanishes? In this case the excited state does 
not decay when subjected to oscillating electric fields: it is stable against electric dipole 
transitions. The fact that some transitions are forbidden is referred to as selection rules. 
This doesn’t mean that these excited atomic states are fully stable because there can 
still be other decay channels as we explain below.

We have already seen situations where $\langle\psi_1|\mathbf{x}|\psi_2\rangle$ vanishes when discussing the Stark
effect. Because $\mathbf{x}$ is parity odd, the two states must differ in parity. However, there
are more stringent selection rules than those that follow from parity alone. Here we
recapitulate and extend these results.


First, an obvious point. The operator $\mathbf{x}$ knows nothing about the spin of the states,
so $|\psi_1\rangle$ and $|\psi_2\rangle$ must have the same spin. We write this as the requirement

$$\Delta s = \Delta m_s = 0$$

More powerful selection rules come from looking at the other angular momentum quantum
numbers. Neglecting spin, the atomic states $|\psi\rangle$ are labelled by $|n,l,m\rangle$. Using
$[L_z, z] = 0$, we have

$$\langle n',l',m'|[L_z, z]|n,l,m\rangle = \hbar(m'-m)\,\langle n',l',m'|z|n,l,m\rangle = 0$$

This tells us that electric fields which oscillate in the z-direction can only effect a
transition if $m = m'$, or

$$\Delta m = 0 \quad\text{for light polarised in the } z \text{ direction}$$

However, we also have $[L_z, x \pm iy] = \pm\hbar(x \pm iy)$ which tells us

$$\langle n',l',m'|[L_z, x \pm iy]|n,l,m\rangle = \hbar(m'-m)\,\langle n',l',m'|x \pm iy|n,l,m\rangle = \pm\hbar\,\langle n',l',m'|x \pm iy|n,l,m\rangle$$

This tells us that electric fields oscillating perpendicular to the z-direction can only
effect a transition when $m' - m = \pm 1$, or

$$\Delta m = \pm 1 \quad\text{for light polarised transverse to the } z \text{ direction}$$


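To determine the allowed transitions between $l$ quantum numbers, we use the identity
$[\mathbf{L}^2, [\mathbf{L}^2, \mathbf{x}]] = 2\hbar^2(\mathbf{x}\mathbf{L}^2 + \mathbf{L}^2\mathbf{x})$, which gives us

$$\langle n',l',m'|[\mathbf{L}^2, [\mathbf{L}^2, \mathbf{x}]]|n,l,m\rangle = \hbar^4\big(l'(l'+1) - l(l+1)\big)^2\,\langle n',l',m'|\mathbf{x}|n,l,m\rangle$$
$$= 2\hbar^4\big(l'(l'+1) + l(l+1)\big)\,\langle n',l',m'|\mathbf{x}|n,l,m\rangle$$

Rearranging and factorising, we have

$$(l+l')(l+l'+2)\big((l-l')^2 - 1\big)\,\langle n',l',m'|\mathbf{x}|n,l,m\rangle = 0$$

Since $l, l' \geq 0$, and the case $l = l' = 0$ is already excluded by parity, we learn that this
matrix element is non-vanishing only if $l - l' = \pm 1$, or

$$\Delta l = \pm 1$$

We've derived each of these selection rules by pulling a commutation relation identity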
out of thin air and then seeing that it happens to give the right answer. This feels a little 
like a trick. A much more systematic approach is to invoke the Wigner-Eckart theorem, 
which tells us what matrix elements are non-vanishing based on the representation 
theory of the rotation group. 
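The algebra behind the $\Delta l$ rule is easy to check by brute force. The sketch below confirms, for small integer $l$ and $l'$, that the difference of the two sides of the double-commutator identity factorises as quoted, and packages the orbital selection rules into a helper. The function names are hypothetical, and spin is deliberately left out.

```python
def commutator_identity_gap(l, lp):
    """(l'(l'+1) - l(l+1))^2 - 2 (l'(l'+1) + l(l+1)), in units of hbar^4."""
    a, b = lp * (lp + 1), l * (l + 1)
    return (a - b) ** 2 - 2 * (a + b)

def factorised_form(l, lp):
    """(l + l')(l + l' + 2)((l - l')^2 - 1), the factorised expression in the text."""
    return (l + lp) * (l + lp + 2) * ((l - lp) ** 2 - 1)

def e1_allowed(l, m, lp, mp):
    """Orbital electric dipole rules: Delta l = +-1 and Delta m in {-1, 0, +1}."""
    if abs(m) > l or abs(mp) > lp:
        raise ValueError("need |m| <= l for both states")
    return abs(lp - l) == 1 and abs(mp - m) <= 1

# The factorisation holds for every pair checked.
identity_holds = all(
    commutator_identity_gap(l, lp) == factorised_form(l, lp)
    for l in range(10) for lp in range(10)
)

# Pairs (l, l') with l + l' > 0 for which the prefactor vanishes,
# so that the matrix element is allowed to be non-zero.
surviving = [(l, lp) for l in range(4) for lp in range(4)
             if l + lp > 0 and factorised_form(l, lp) == 0]

print("factorisation holds:", identity_holds)
print("pairs with vanishing prefactor:", surviving)
print("2p (m=0) -> 1s allowed:", e1_allowed(1, 0, 0, 0))
print("2s -> 1s allowed:", e1_allowed(0, 0, 0, 0))
```

Only pairs with $|l - l'| = 1$ survive, in line with the rule $\Delta l = \pm 1$.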


An example of an electric dipole transition consistent with these selection rules is the
$2p \to 1s$ decay of hydrogen. It is a simple matter to compute this using the formulae
above: one finds a lifetime $\tau \approx 10^{-9}$ seconds. In contrast, the $2s \to 1s$ transition is
forbidden by the selection rule $\Delta l = \pm 1$. The decay does eventually happen, but has
to find another route. (It turns out that it primarily emits two photons rather than
one). Correspondingly, the lifetime is much longer, $\tau \approx 10^{-1}$ seconds.
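The quoted order of magnitude for the $2p \to 1s$ lifetime can be reproduced from (4.30) and (4.31). The sketch below rests on two inputs that are assumptions of this example rather than results derived in the notes: the standard textbook matrix element $|\langle 1s|\mathbf{x}|2p\rangle|^2 = (2^{15}/3^{10})\,a_0^2$ for a single 2p state, and a level splitting of 10.2 eV. It also reads (4.31) as giving $B_{21}$ for one such 2p state.

```python
import math

# Physical constants in SI units (standard values, rounded).
HBAR = 1.054571817e-34       # reduced Planck constant, J s
C_LIGHT = 2.99792458e8       # speed of light, m / s
EPS0 = 8.8541878128e-12      # vacuum permittivity, F / m
E_CHARGE = 1.602176634e-19   # elementary charge, C
A0 = 5.29177210903e-11       # Bohr radius, m

# Assumed inputs (not derived in the text).
omega0 = 10.2 * E_CHARGE / HBAR             # 2p - 1s splitting of 10.2 eV as a frequency
x_squared = (2 ** 15 / 3 ** 10) * A0 ** 2   # |<1s| x |2p>|^2 for one 2p state

# Stimulated emission coefficient from (4.31), then spontaneous rate from (4.30).
b21 = math.pi * E_CHARGE ** 2 * x_squared / (3 * EPS0 * HBAR ** 2)
a21 = HBAR * omega0 ** 3 / (math.pi ** 2 * C_LIGHT ** 3) * b21
tau = 1 / a21

print(f"A21 = {a21:.3e} per second")
print(f"tau = {tau:.3e} seconds")
```

The result is a lifetime of order a nanosecond, consistent with the estimate quoted above.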


There’s a cute piece of physics here related to the Stark effect. Recall from Section 
4.1 that a constant background electric field causes the 2s state of hydrogen to mix 
with the 2p state. (See equation (4.5).) But, when combined with the phenomenon of
spontaneous emission, this state immediately becomes more unstable. This means that 
we can create a gas of hydrogen atoms in the 2s state, comfortable in the knowledge 
that they will last a relatively long time (around a tenth of a second). But when 
subjected to a constant electric field, they will immediately decay to the ground state, 
releasing a burst of light. 


Magnetic Dipole Transitions 


The selection rules described above hold for electric dipole transitions. However, if the 
matrix elements vanish it does not mean that the excited state of the atom is absolutely 
stable. To paraphrase Jeff Goldblum, Nature will find a way. There are other channels 
through which the atom can decay. Indeed, we already briefly described the magnetic 
dipole transition, in which the relevant matrix element is 


$$\langle\psi_1|\boldsymbol{\mu}|\psi_2\rangle$$

Here the selection rules are different. In particular, $\boldsymbol{\mu}$ is related to the angular momentum
operator and is parity even. This means that, in contrast to the electric dipole
transition, the matrix element above is non-vanishing only if $|\psi_1\rangle$ and $|\psi_2\rangle$ have the
same parity. For example, transitions between levels split by fine structure or hyperfine
structure have the same parity and so occur through magnetic dipole effects.


The lifetime of any excited state is determined by the largest matrix element. Some- 
times, even the largest matrix element can be very small in which case the atomic state 
is long lived. An extreme example occurs for the hyperfine structure of hydrogen, which 
gives rise to the 21 cm line: its lifetime is around 10 million years. 




4.4 Photons 


The relationships (4.29) and (4.30) have allowed us to determine the rate of spontaneous
emission of a photon. But it’s clear the argument relied on the magic of thermodynam- 
ics. To go beyond this description, we need a way to incorporate both the quantum 
state of the atom and the quantum state of the electromagnetic field. This is the frame- 
work of Quantum Field Theory. We will see how to quantise the electromagnetic field 
in next year’s Quantum Field Theory lectures. Here we offer a baby version. 


4.4.1 The Hilbert Space of Photons 


The quantum state of the electromagnetic field is described by how many photons
it contains. Each photon is a particle of light. Its properties are described by two
quantum numbers. The first is the momentum, which is given by $\mathbf{p} = \hbar\mathbf{k}$. Here $\mathbf{k}$
is the wavevector and its magnitude, $k = |\mathbf{k}|$, is the wavenumber; it is related to the
wavelength by $\lambda = 2\pi/k$ and to the frequency by $\omega(k) = kc$. The energy of a photon
is given by the famous formula

$$E = \hbar\omega \tag{4.32}$$

Note that, when combined with the definition of momentum, this is simply the relativistic
dispersion relation for a massless particle: $E = pc$.


The second property of the photon is its polarisation. This is described by a vector
which is orthogonal to $\mathbf{k}$. For each $\mathbf{k}$, we define a two-dimensional basis of polarisation
vectors $\mathbf{e}_\lambda$, with $\lambda = 1,2$, obeying

$$\mathbf{e}_\lambda\cdot\mathbf{k} = 0$$

To specify the state of the electromagnetic field, we need to say how many photons
it contains, together with the information $\mathbf{k}$ and $\mathbf{e}_\lambda$ for each photon. The states are
therefore labelled by a list of non-negative integers,

$$|\{n_{\mathbf{k}\lambda}\}\rangle$$

where $n_{\mathbf{k}\lambda} \in \mathbf{Z}$ tells us how many photons we have with momentum $\mathbf{k}$ and polarisation
$\lambda$.


We start with the state with no photons. This is the vacuum state and is denoted
as $|0\rangle$. The key to quantum field theory is to view the particles (in this case, the
photons) as excitations of the underlying field, in much the same way that the states
of the harmonic oscillator arise from exciting the vacuum. For each type of photon, we
introduce annihilation and creation operators, $a_{\mathbf{k}\lambda}$ and $a^\dagger_{\mathbf{k}\lambda}$. These obey the familiar
commutation relations of the harmonic oscillator,

$$[a_{\mathbf{k}\lambda}, a^\dagger_{\mathbf{k}'\lambda'}] = \delta_{\mathbf{k},\mathbf{k}'}\,\delta_{\lambda,\lambda'}$$

The annihilation operators have the property that $a_{\mathbf{k}\lambda}|0\rangle = 0$. The quantum state of a
single photon with momentum $\mathbf{k}$ and polarisation $\lambda$ is described by $a^\dagger_{\mathbf{k}\lambda}|0\rangle$. The general
state of the quantum field is given by

$$|\{n_{\mathbf{k}\lambda}\}\rangle = \prod_{\mathbf{k},\lambda}\frac{(a^\dagger_{\mathbf{k}\lambda})^{n_{\mathbf{k}\lambda}}}{\sqrt{n_{\mathbf{k}\lambda}!}}\,|0\rangle \tag{4.33}$$

This is the same kind of set-up that we saw in the lectures on Solid State Physics when 
discussing the quantisation of phonons. 


So far we have only described the Hilbert space of the electromagnetic field. It 
consists of an infinite number of harmonic oscillators, one for each k and À. Note that 
already here we’re dealing with something unfamiliar from the quantum mechanics 
perspective. Usually in quantum mechanics we fix the number of particles and then 
look at the Hilbert space. But here our Hilbert space contains states with different 
numbers of photons. Such Hilbert spaces are sometimes referred to as Fock spaces. 


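The final step is to determine the Hamiltonian that governs the evolution of these
states. This too is lifted from the harmonic oscillator: it is

$$H = \sum_{\mathbf{k},\lambda}\hbar\omega(k)\left(a^\dagger_{\mathbf{k}\lambda}a_{\mathbf{k}\lambda} + \frac{1}{2}\right)$$

Acting on our states (4.33) we have, after dropping the constant zero-point contribution,

$$H|\{n_{\mathbf{k}\lambda}\}\rangle = E\,|\{n_{\mathbf{k}\lambda}\}\rangle \quad\text{with}\quad E = \sum_{\mathbf{k},\lambda}n_{\mathbf{k}\lambda}\,\hbar\omega(k)$$

which agrees with the formula (4.32), now generalised to a large number of photons.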
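As a concrete illustration of the occupation-number labelling, the sketch below stores a photon state as a dictionary from (wavevector, polarisation) to photon number and sums $n\,\hbar\omega(k)$ over the occupied modes, leaving out the zero-point term. The data layout, the function names and the two example wavelengths are all illustrative assumptions.

```python
import math

HBAR = 1.054571817e-34    # reduced Planck constant, J s
C_LIGHT = 2.99792458e8    # speed of light, m / s

def photon_energy(k_vec):
    """Single-photon energy hbar * omega with omega = c |k|."""
    return HBAR * C_LIGHT * math.sqrt(sum(component ** 2 for component in k_vec))

def fock_state_energy(occupations):
    """Sum of n * hbar * omega(k) over modes; occupations maps (k_vec, polarisation) -> n."""
    return sum(n * photon_energy(k_vec) for (k_vec, _polarisation), n in occupations.items())

# Two example wavevectors, built from assumed wavelengths of 532 nm and 633 nm.
k_green = (0.0, 0.0, 2 * math.pi / 532e-9)
k_red = (2 * math.pi / 633e-9, 0.0, 0.0)

# Three green photons with polarisation 1, one with polarisation 2, two red photons.
state = {(k_green, 1): 3, (k_green, 2): 1, (k_red, 1): 2}

total_photons = sum(state.values())
total_energy = fock_state_energy(state)
print(f"{total_photons} photons, E = {total_energy:.3e} J")
```

The polarisation label distinguishes modes but does not enter the energy, which depends only on $|\mathbf{k}|$.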


Above, we have simply stated the Hilbert space and Hamiltonian for the electro- 
magnetic field. Of course, ultimately we should derive these results starting from the 
Maxwell equations. This will be done in the Quantum Field Theory course.

4.4.2 Coherent States


Recall from our earlier lectures on the harmonic oscillator that there is a special state
which most closely mimics a classical state. This is the coherent state. In the present
context a coherent state is parameterised by $\alpha \in \mathbf{C}$ and consists of a sum of photons,
each with the same wavevector and polarisation. We write $a \equiv a_{\mathbf{k}\lambda}$. The coherent state
can then be expressed as

$$|\alpha\rangle = e^{\alpha a^\dagger - \alpha^* a}|0\rangle = e^{-|\alpha|^2/2}\,e^{\alpha a^\dagger}|0\rangle$$


where the equality follows from some standard manipulations of creation and annihi- 
lation operators. States of this kind are the closest that a quantum state gets to a 
classical plane wave. In particular, the classical expectation values of the electric and 
magnetic fields can be shown to oscillate back and forth with frequency w = kc. 


The coherent states are eigenstates of the annihilation operator, $a|\alpha\rangle = \alpha|\alpha\rangle$, meaning that they
are unchanged by the removal of a photon. The parameter $\alpha$ determines the mean
number of photons in the state,

$$\langle n\rangle = \langle\alpha|a^\dagger a|\alpha\rangle = |\alpha|^2$$
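Expanding $e^{-|\alpha|^2/2}e^{\alpha a^\dagger}|0\rangle$ in number states gives amplitudes $e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$, so the photon number follows a Poisson distribution. The sketch below builds these amplitudes with a truncation at a finite photon number and checks the norm, the mean and the variance. The value of $\alpha$ and the cutoff are arbitrary choices for illustration.

```python
import math

def coherent_amplitudes(alpha, n_max):
    """Amplitudes <n|alpha> = exp(-|alpha|^2 / 2) alpha^n / sqrt(n!) for n = 0 .. n_max."""
    amplitudes = []
    term = complex(math.exp(-abs(alpha) ** 2 / 2))    # the n = 0 amplitude
    for n in range(n_max + 1):
        amplitudes.append(term)
        term = term * alpha / math.sqrt(n + 1)         # step from <n|alpha> to <n+1|alpha>
    return amplitudes

ALPHA = 2.0 + 1.0j     # arbitrary example with |alpha|^2 = 5
amps = coherent_amplitudes(ALPHA, n_max=80)

probs = [abs(c) ** 2 for c in amps]
norm = sum(probs)
mean_n = sum(n * p for n, p in enumerate(probs))
var_n = sum((n - mean_n) ** 2 * p for n, p in enumerate(probs))

print(f"norm = {norm:.6f}, <n> = {mean_n:.6f}, variance = {var_n:.6f}")
```

The mean reproduces $|\alpha|^2$, and the equal mean and variance are the Poisson signature. The recursion used to build the amplitudes is just the eigenvalue condition for the annihilation operator written in components, $\sqrt{n+1}\,\langle n+1|\alpha\rangle = \alpha\,\langle n|\alpha\rangle$.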


Coherent states play a particularly important role in quantum optics. In this context, 
they are sometimes referred to as Glauber states. (Roy Glauber was awarded the 2005 
Nobel prize for his work on optical coherence.) 


Making a Coherent State 


The light emitted by a laser is described by a coherent state. I’m not going to try
to explain how a laser works here. (It’s to do with stimulated emission of a bunch of
atoms.) But there is a simple model which explains how coherent states naturally arise:
it is the driven harmonic oscillator,

$$H = \hbar\omega\left(a^\dagger a + \frac{1}{2}\right) + \hbar\left(f^*(t)\,a + f(t)\,a^\dagger\right)$$

Here $f(t)$ is a forcing function which excites the harmonic oscillator. In the context of
electrodynamics, we think of $a^\dagger$ as creating photons of frequency $\omega$ (and some unspecified
polarisation). We will now show that the forcing term creates photons.


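We solve the Hamiltonian in the interaction picture, taking $H_0 = \hbar\omega(a^\dagger a + \frac{1}{2})$. Recall
that states in the interaction picture are related to those in the Schrödinger picture by
$|\psi\rangle_I = e^{iH_0t/\hbar}|\psi\rangle_S$. The interaction picture for the interaction Hamiltonian is

$$H_I = \hbar\,e^{iH_0t/\hbar}\left(f^*(t)\,a + f(t)\,a^\dagger\right)e^{-iH_0t/\hbar} = \hbar\left(e^{-i\omega t}f^*(t)\,a + e^{i\omega t}f(t)\,a^\dagger\right)$$

The states then evolve as $|\psi(t)\rangle_I = U_I(t)|\psi(0)\rangle_I$, where the unitary operator $U_I$ obeys

$$i\hbar\frac{\partial U_I}{\partial t} = H_I U_I$$

You can check that the solution is given by

$$U_I(t) = \exp\left(\alpha(t)\,a^\dagger - \alpha^*(t)\,a + i\varphi(t)\right)$$

where $\alpha(t) = -i\int^t dt'\,f(t')\,e^{i\omega t'}$ and $\varphi(t) = \int^t dt'\,\mathrm{Im}(\alpha^*\dot{\alpha})$. (To check this, you’ll need
to use some commutation relations, in particular $[e^{\alpha a^\dagger}, a]\,e^{-\alpha a^\dagger} = -\alpha$.)

Now suppose that we drive the oscillator at its natural frequency, so that $f(t) =
f_0\,e^{-i\omega t}$ with $f_0$ real. In this case, $\alpha(t) = -if_0t$ and, starting from the vacuum, the states
in the interaction picture are given by

$$|\psi(t)\rangle_I = e^{-if_0(a^\dagger + a)t}|0\rangle = e^{-(f_0t)^2/2}\,e^{-if_0a^\dagger t}|0\rangle$$

This is the coherent state $|\alpha\rangle$. Equivalently, if we transform back to the Schrödinger
picture, we have, up to an overall phase, the coherent state

$$|\psi(t)\rangle_S = e^{-iH_0t/\hbar}|\psi(t)\rangle_I = e^{-(f_0t)^2/2}\,e^{-if_0e^{-i\omega t}a^\dagger t}|0\rangle$$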

The upshot of this discussion is that adding a forcing term to the harmonic oscillator 
drives the ground state to a coherent state. While this doesn’t explain the importance 
of coherent states in, say, laser physics, hopefully it at least provides some motivation. 
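

A quick numerical sanity check of the resonant case is sketched below. It is only an illustration: the function names, the truncation of the Fock space to 40 levels and the choice $f_0t = 1$ are assumptions of mine. The code applies $e^{-if_0t(a + a^\dagger)}$ to the vacuum by summing the Taylor series, and compares the result with the coherent state amplitudes $e^{-|\alpha|^2/2}\alpha^n/\sqrt{n!}$ for $\alpha = -if_0t$.

```python
import math


def drive_vacuum(f0_t, dim=40, terms=120):
    """Apply exp(-i f0 t (a + a^dagger)) to |0> in a Fock space truncated
    to `dim` levels, by summing the Taylor series of the exponential."""

    def apply_x(v):
        # (a + a^dagger) v, using a|n> = sqrt(n)|n-1> and
        # a^dagger|n> = sqrt(n+1)|n+1>; the top level is simply cut off.
        out = [0j] * dim
        for n, amp in enumerate(v):
            if n > 0:
                out[n - 1] += math.sqrt(n) * amp
            if n + 1 < dim:
                out[n + 1] += math.sqrt(n + 1) * amp
        return out

    state = [0j] * dim
    term = [1 + 0j] + [0j] * (dim - 1)  # zeroth-order term: the vacuum
    for k in range(terms):
        state = [s + t for s, t in zip(state, term)]
        term = [(-1j * f0_t) * x / (k + 1) for x in apply_x(term)]
    return state


def coherent_amplitudes(alpha, dim=40):
    """Fock amplitudes exp(-|alpha|^2 / 2) alpha^n / sqrt(n!)."""
    norm = math.exp(-abs(alpha) ** 2 / 2)
    return [norm * alpha ** n / math.sqrt(math.factorial(n)) for n in range(dim)]


def mean_photon_number(state):
    """<n> = sum_n n |c_n|^2 for a state given by its Fock amplitudes."""
    return sum(n * abs(c) ** 2 for n, c in enumerate(state))
```

Calling `drive_vacuum(1.0)` and `coherent_amplitudes(-1j)` should give matching low-lying amplitudes, and `mean_photon_number` should return $|\alpha|^2 = (f_0t)^2$ up to truncation error.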


4.4.3 The Jaynes-Cummings Model 


Now that we have a description of the quantised electromagnetic field, we would like to 
understand how it interacts with atoms. Here we construct a simple, toy model that 
captures the physics. 


The first simplification is that we consider the atom to have just two states. This is 
essentially the same approximation that we made in Section 4.3 when discussing Rabi 
oscillations. Here we change notation slightly: we call the ground state of the system
$|\downarrow\rangle$ and the excited state of the system $|\uparrow\rangle$. (These names are adopted from the
notation for spin, but that’s not the meaning here. For example, $|\downarrow\rangle$ may describe the
1s state of hydrogen, and $|\uparrow\rangle$ the 2p state.)


As in our discussion of Rabi oscillations, we take the energy splitting between the two
states to be $\hbar\omega_0$. This means that, in the absence of any coupling to the electromagnetic
field, our two-state “atom” is simply described by the Hamiltonian


$$H_{\rm atom} = \frac{1}{2}\begin{pmatrix}\hbar\omega_0 & 0 \\ 0 & -\hbar\omega_0\end{pmatrix} \tag{4.34}$$


This atom will interact with photons of frequency $\omega$. We will only include photons
with this frequency and no others. In reality, this is achieved by placing the atom in a
box which can only accommodate photons of wavelength $\lambda = 2\pi c/\omega$. For this reason,
the restriction to a single frequency of photon is usually referred to as cavity quantum
electrodynamics.


We will ignore the polarisation of the photon. Following our discussion above, we
introduce the creation operator $a^\dagger$. The Hilbert space of photons is then spanned by
the states $|n\rangle = (a^\dagger)^n/\sqrt{n!}\,|0\rangle$, with Hamiltonian


$$H_{\rm photon} = \hbar\omega\left(a^\dagger a + \tfrac{1}{2}\right) \tag{4.35}$$


We often omit the zero-point energy $\hbar\omega/2$ since it only contributes a constant.


Combining the two, the Hilbert space is $\mathcal{H} = \mathcal{H}_{\rm atom}\otimes\mathcal{H}_{\rm photon}$ and is spanned by the
states $|n,\uparrow\rangle$ and $|n,\downarrow\rangle$, with $n \geq 0$. The Hamiltonian includes both (4.34) and (4.35),
but also has an interaction term. We want this interaction term to have the property
that if the excited state $|\uparrow\rangle$ decays to the ground state $|\downarrow\rangle$ then it emits a photon.
Similarly, the ground state $|\downarrow\rangle$ may absorb a photon to become excited to $|\uparrow\rangle$. This
physics is captured by the following Hamiltonian


$$H_{JC} = \frac{\hbar}{2}\begin{pmatrix}\omega_0 & g\,a \\ g\,a^\dagger & -\omega_0\end{pmatrix} + \hbar\omega\,a^\dagger a$$


This is the Jaynes-Cummings model. The constant g characterises the coupling between 
the atom and the photons. 


As we'll see, the Jaynes-Cummings model captures many of the features that we’ve 
seen already, including Rabi oscillations and spontaneous emission. However, you 
shouldn’t think of the photons in this model as little wavepackets which, when emit- 
ted, disappear off into the cosmos, never to be seen again. Instead, the photons are 
momentum eigenstates, spread throughout the cavity in which the atom sits. When
emitted, they hang around. This will be important to understand the physics.


We now look at the dynamics of the Jaynes-Cummings model. The state $|0,\downarrow\rangle$ describes
an atom in the ground state with no photons around. This state is an eigenstate
of $H_{JC}$ with energy $H_{JC}|0,\downarrow\rangle = -\tfrac{1}{2}\hbar\omega_0|0,\downarrow\rangle$.


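However, the state $|0,\uparrow\rangle$, describing an excited atom in the vacuum, is not an eigenstate.
It can evolve into $|1,\downarrow\rangle$, describing an atom in the ground state with one photon.
More generally, the Hilbert space splits into sectors with the $|n-1,\uparrow\rangle$ state mixing
with the $|n,\downarrow\rangle$ state. Restricted to these two states, the Hamiltonian is a 2 × 2 matrix
given by


$$H_n = \left(n - \tfrac{1}{2}\right)\hbar\omega\,\mathbf{1}_2 + \tfrac{1}{2}\hbar(\omega_0 - \omega)\,\sigma^3 + \tfrac{1}{2}\hbar g\sqrt{n}\,\sigma^1$$


where $\sigma^i$ are the Pauli matrices. The two eigenstates are


$$|n_+\rangle = \sin\theta\,|n-1,\uparrow\rangle - \cos\theta\,|n,\downarrow\rangle$$
$$|n_-\rangle = \cos\theta\,|n-1,\uparrow\rangle + \sin\theta\,|n,\downarrow\rangle$$


where

$$\tan(2\theta) = \frac{g\sqrt{n}}{\delta}\ ,\qquad \delta = \omega_0 - \omega \tag{4.36}$$


$\delta$ is the same detuning parameter we used before. When $\delta = 0$, we are on resonance,
with the energy of the photon coinciding with the energy splitting of the atom. In
general, the two energy eigenvalues are


$$E_\pm = \left(n - \tfrac{1}{2}\right)\hbar\omega \pm \tfrac{1}{2}\hbar\sqrt{g^2n + \delta^2}$$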


Let’s now extract some physics from these solutions. 


Rabi Oscillations Revisited 


Consider an atom in the ground state, surrounded by a fixed number of photons $n$.
The initial state is $|\Psi(t=0)\rangle = |n,\downarrow\rangle = \sin\theta\,|n_-\rangle - \cos\theta\,|n_+\rangle$. The state subsequently
evolves as


$$|\Psi(t)\rangle = e^{-iE_-t/\hbar}\sin\theta\,|n_-\rangle - e^{-iE_+t/\hbar}\cos\theta\,|n_+\rangle$$


From this, we can extract the probability of sitting in the excited state


$$P_\uparrow(t) = \frac{g^2n}{g^2n + \delta^2}\,\sin^2\left(\frac{\sqrt{g^2n + \delta^2}}{2}\,t\right)$$


This agrees with our earlier result (4.23) which was derived for an atom sitting in a
classical electric field. Note that the Rabi frequency (4.20) should be equated with
$\Omega = g\sqrt{n}$. This makes sense: the coupling $g$ is capturing the matrix element, while the
number of photons $n$ is proportional to the energy stored in the electromagnetic field,
so $\sqrt{n}$ is proportional to the amplitude of the electric field.
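

The Rabi formula can be checked against a direct integration of the Schrödinger equation in the two-state sector. What follows is a minimal sketch, not part of the derivation: the function names and the Runge-Kutta integrator are my own choices, units are such that $\hbar = 1$, and the part of the Hamiltonian proportional to the identity is dropped because it only contributes an overall phase.

```python
import math


def rabi_probability_numeric(g, n, delta, t_final, steps=4000):
    """Integrate i d(psi)/dt = H psi with fourth-order Runge-Kutta, starting
    in |n, down>. Basis order is (|n-1, up>, |n, down>), with hbar = 1.
    H = (delta / 2) sigma^3 + (g sqrt(n) / 2) sigma^1."""
    coupling = g * math.sqrt(n)
    h = [[delta / 2, coupling / 2],
         [coupling / 2, -delta / 2]]

    def rhs(psi):
        return [-1j * (h[0][0] * psi[0] + h[0][1] * psi[1]),
                -1j * (h[1][0] * psi[0] + h[1][1] * psi[1])]

    psi = [0j, 1 + 0j]
    dt = t_final / steps
    for _ in range(steps):
        k1 = rhs(psi)
        k2 = rhs([psi[i] + 0.5 * dt * k1[i] for i in range(2)])
        k3 = rhs([psi[i] + 0.5 * dt * k2[i] for i in range(2)])
        k4 = rhs([psi[i] + dt * k3[i] for i in range(2)])
        psi = [psi[i] + dt * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i]) / 6
               for i in range(2)]
    return abs(psi[0]) ** 2  # probability of finding |n-1, up>


def rabi_probability_formula(g, n, delta, t):
    """Closed form: g^2 n / (g^2 n + delta^2) * sin^2(sqrt(g^2 n + delta^2) t / 2)."""
    rate = math.sqrt(g * g * n + delta * delta)
    return (g * g * n / rate ** 2) * math.sin(rate * t / 2) ** 2
```

For example, with $g = 1$, $n = 4$ and $\delta = 0.5$ the two functions should agree to within the integration error at any time $t$.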


Death and Resurrection 


The Jaynes-Cummings model also captures new physics, not seen when we treat the
electromagnetic field classically. This is simplest to see if we tune the photons to
resonance, setting $\delta = 0$. With this choice, (4.36) tells us that $\cos\theta = \sin\theta = 1/\sqrt{2}$.


We again place the atom in its ground state, but this time we do not surround it with 
a fixed number of photons. Instead, we place the electromagnetic field in a coherent 
state 


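$$|\Psi\rangle = e^{-|\alpha|^2/2}e^{\alpha a^\dagger}|0,\downarrow\rangle = e^{-|\alpha|^2/2}\sum_{n=0}^\infty\frac{\alpha^n}{\sqrt{n!}}\,|n,\downarrow\rangle$$


We will take the average number of photons in this state to be macroscopically large.
This means $|\alpha| \gg 1$. Now the evolution is given by


$$|\Psi(t)\rangle = e^{-(|\alpha|^2 - i\omega t)/2}\sum_{n=0}^\infty\frac{\alpha^n e^{-in\omega t}}{\sqrt{n!}}\left[\cos\left(\frac{g\sqrt{n}\,t}{2}\right)|n,\downarrow\rangle + i\sin\left(\frac{g\sqrt{n}\,t}{2}\right)|n-1,\uparrow\rangle\right]$$


The probability to find the atom in its excited state is


$$P_\uparrow(t) = e^{-|\alpha|^2}\sum_{n=0}^\infty\frac{|\alpha|^{2n}}{n!}\sin^2\left(\frac{g\sqrt{n}\,t}{2}\right)$$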


Now there are many oscillatory contributions to the probability, each with a different 
frequency. We would expect these to wash each other out, so that there are no coherent 
oscillations in the probability. Indeed, as we will now see, this is what happens. But
there is also a surprise in store. 


To analyse the sum over different frequencies, we first rewrite the probability as


$$P_\uparrow(t) = e^{-|\alpha|^2}\sum_{n=0}^\infty\frac{|\alpha|^{2n}}{n!}\left(\frac{1}{2} - \frac{1}{2}\cos(g\sqrt{n}\,t)\right) = \frac{1}{2} - \frac{1}{2}e^{-|\alpha|^2}\sum_{n=0}^\infty\frac{|\alpha|^{2n}}{n!}\cos(g\sqrt{n}\,t)$$

where, in the second equality, we have used the Taylor expansion of the exponential.
The sum is sharply peaked at the value $n \approx |\alpha|^2$. To see this, we use Stirling’s formula
to write

$$\frac{|\alpha|^{2n}}{n!} \approx \frac{1}{\sqrt{2\pi n}}\,e^{n\log|\alpha|^2 - n\log n + n}$$


Figure 24: Rabi oscillations at short times...
Figure 25: ...and their decay at longer times.


The exponent $f(n) = 2n\log|\alpha| - n\log n + n$ has a maximum at $f'(n) = \log|\alpha|^2 - \log n = 0$,
or $n = |\alpha|^2$. We then use $f''(n) = -1/n$. Taylor expanding around the maximum, we
have

$$\frac{|\alpha|^{2n}}{n!} \approx \frac{1}{\sqrt{2\pi|\alpha|^2}}\,e^{|\alpha|^2 - m^2/2|\alpha|^2}$$

where $m = n - |\alpha|^2$. With $|\alpha|^2$ sufficiently large, the sum over $m$ effectively ranges
from $-\infty$ to $+\infty$. We have


$$P_\uparrow(t) \approx \frac{1}{2} - \frac{1}{2}\sum_{m=-\infty}^{\infty}\frac{1}{\sqrt{2\pi|\alpha|^2}}\,e^{-m^2/2|\alpha|^2}\cos\left(gt\sqrt{|\alpha|^2 + m}\right) \tag{4.37}$$


Let’s now try to build some intuition for this sum. First note that for very short time
periods, there will be the familiar Rabi oscillations. A single cycle occurs with period
$gT|\alpha| = 2\pi$, or


$$T_{\rm Rabi} \approx \frac{2\pi}{g|\alpha|}$$


These oscillations occur at a Rabi frequency determined by the average number of
photons $\langle n\rangle = |\alpha|^2$. In Figure 24, we’ve plotted the function (4.37) for $|\alpha| = 20$
and times $gt \leq 2$. We clearly see the Rabi oscillations at these time scales.


There are other features that occur on longer time scales. The exponential suppression
means that only the terms up to $|m| \approx |\alpha|$ will contribute in a significant
way. If, over the range of these terms, we get a change of phase by $2\pi$ then we
expect destructive interference among the different oscillations. This occurs when
$gT(\sqrt{|\alpha|^2 + |\alpha|} - |\alpha|) = 2\pi$, or


$$T_{\rm collapse} \approx \frac{4\pi}{g}$$


Figure 26: Once decayed, they stay decayed...
Figure 27: ...until they don’t!


This tells us that after approximately $|\alpha|$ Rabi oscillations, the probability asymptotes
to $P_\uparrow = \tfrac{1}{2}$. This is the expected behaviour if the atom is subjected to lots of different
frequencies. This collapse is clearly seen in Figure 25, which plots the
function (4.37) for $|\alpha| = 20$ and time scales up to $gt \leq 10$. Indeed, Figure 26
extends the timescale to $gt \approx 50$, where we clearly see that the
probability settles to $P_\uparrow = \tfrac{1}{2}$.


However, there is a surprise in store! At much longer timescales, each term in the
sum picks up the same phase from the cos factor: i.e. $\cos(gT|\alpha|) = \cos(gT\sqrt{|\alpha|^2 + 1})$,
or $gT(\sqrt{|\alpha|^2 + 1} - |\alpha|) = 2\pi$. This occurs when


$$T_{\rm revival} \approx \frac{4\pi|\alpha|}{g}$$

On these time scales, the terms in the sum once again add coherently and we can find
the particle in the excited state with an enhanced probability. This is called quantum
revival and is clearly seen in Figure 27. Note that the probability in the revival never
reaches one, nor dips to zero.
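

The three time scales can be seen by evaluating the exact sum for $P_\uparrow(t)$ numerically. The sketch below is an illustration under my own assumptions: the function names are hypothetical, time is measured through the dimensionless combination $gt$, the sum is cut off at $n = 1000$ (ample for $|\alpha| = 20$), and the Poisson weights are computed in log space to avoid overflow.

```python
import math


def poisson_weights(alpha_abs, n_max):
    """Weights exp(-|alpha|^2) |alpha|^{2n} / n! for n = 0 .. n_max - 1."""
    mean = alpha_abs ** 2
    return [math.exp(n * math.log(mean) - mean - math.lgamma(n + 1))
            for n in range(n_max)]


def excited_probability(gt, weights):
    """P_up = sum_n w_n sin^2(g sqrt(n) t / 2), as a function of gt."""
    return sum(w * math.sin(0.5 * gt * math.sqrt(n)) ** 2
               for n, w in enumerate(weights))


def max_deviation_from_half(gt_start, gt_stop, weights, samples=200):
    """Largest |P_up - 1/2| on an evenly spaced grid in [gt_start, gt_stop]."""
    step = (gt_stop - gt_start) / samples
    return max(abs(excited_probability(gt_start + i * step, weights) - 0.5)
               for i in range(samples + 1))
```

With `weights = poisson_weights(20.0, 1000)`, the probability is close to one at $gt = \pi/|\alpha|$ (half a Rabi cycle), `max_deviation_from_half(30.0, 60.0, weights)` is tiny (the collapsed plateau), and the same function evaluated on a window around $gt = 4\pi|\alpha|$ is again sizeable (the revival), though the oscillation no longer spans the full range from zero to one.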


Revival is a novel effect that arises from the quantisation of the electromagnetic field;
it has no classical analog. Note that this effect does not occur because of any coherence
between the individual photon states. Rather, it occurs because of the discreteness of
the electromagnetic field.


Finally, we can ask what the probability looks like on extremely long time scales
$t \gg T_{\rm revival}$. In Figure 28, we continue our plots to $gt = 5000$. We see a number of
collapses and revivals, until the system becomes noisy and fluctuating at large times.


5. Quantum Foundations


What is the essence of quantum mechanics? What makes the quantum world truly 
different from the classical one? Is it the discrete spectrum of energy levels? Or the 
inherent lack of determinism? 


The purpose of this chapter is to go back to basics in an attempt to answer this 
question. For the most part, we will not be interested in the dynamics of quantum 
systems (although Section 5.5 is an exception). Instead, we will look at the framework 
of quantum mechanics in an attempt to get a better understanding of what we mean 
by a “state”, and what we mean by a “measurement”. 


5.1 Entanglement 


“I would not call that one but rather the characteristic trait of quantum
mechanics, the one that enforces its entire departure from classical lines of
thought”


Erwin Schrödinger on entanglement


The differences between the classical and quantum worlds are highlighted most em- 
phatically when we look at a property called entanglement. This section and, indeed, 
much of this chapter will be focussed on building the tools necessary to understand the 
surprising features of entangled quantum states. 


Entanglement is a property of two or more quantum systems. Here we consider two
systems, with associated Hilbert spaces $\mathcal{H}_1$ and $\mathcal{H}_2$ respectively. The Hilbert space of
the combined system is then $\mathcal{H}_1\otimes\mathcal{H}_2$. A state of this combined system is said to be
entangled if it cannot be written in the form


$$|\Psi\rangle = |\psi_1\rangle\otimes|\psi_2\rangle \tag{5.1}$$


For example, suppose we have two particles, each of which can have one of two states.
Such a two-state system is called a qubit. We take a basis of this Hilbert space to be the
spin in the z-direction, with eigenstates spin up $|\uparrow\rangle$ or spin down $|\downarrow\rangle$. Then the state


$$|\Psi\rangle = |\uparrow\rangle\otimes|\downarrow\rangle$$


is not entangled. In contrast, the state


$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle\otimes|\downarrow\rangle - |\downarrow\rangle\otimes|\uparrow\rangle\right)$$


is entangled. In fact, this is the most famous of all entangled states and is usually
known as an EPR pair, after Einstein, Podolsky and Rosen. Note that this state is
a sum over states of the form (5.1) and cannot be written in a simpler form; this is
what makes it entangled. In what follows, we’ll simplify our notation and drop the $\otimes$
symbol, so the EPR pair is written as


$$|EPR\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle\right) \tag{5.2}$$


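To illustrate the concept of entanglement, we could just as easily have chosen the states
$|\Psi\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle)$ or $|\Psi\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle)$. Both of these are also
entangled. However, just because a state is written as a sum of terms of the form (5.1)
does not necessarily mean that it’s entangled. Consider, for example,


$$|\Psi\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\downarrow\rangle\right)$$


This can also be written as $|\Psi\rangle = |\rightarrow\rangle|\downarrow\rangle$ where $|\rightarrow\rangle = \frac{1}{\sqrt{2}}(|\uparrow\rangle + |\downarrow\rangle)$ and so this state
is not entangled. We’ll provide a way to check whether or not a state is entangled in
Section 5.3.3.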
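

For two qubits there is a quick test that can be run by hand or by machine. Writing $|\Psi\rangle = \sum_{ij}c_{ij}|i\rangle|j\rangle$, the state is of the product form (5.1) precisely when the 2 × 2 matrix $c_{ij}$ has rank one, i.e. when $\det c = 0$. This is a standard linear algebra fact rather than the method of Section 5.3.3, and the function name and tolerance below are assumptions made for illustration.

```python
import math


def is_product_state(c, tol=1e-12):
    """c[i][j] is the amplitude of |i>|j>, with index 0 = up and 1 = down.
    A two-qubit pure state factorises if and only if the 2x2 amplitude
    matrix has rank one, i.e. its determinant vanishes."""
    det = c[0][0] * c[1][1] - c[0][1] * c[1][0]
    return abs(det) < tol


r = 1 / math.sqrt(2)
epr = [[0, r], [-r, 0]]        # (|ud> - |du>) / sqrt(2)
triplet = [[0, r], [r, 0]]     # (|ud> + |du>) / sqrt(2)
aligned = [[r, 0], [0, r]]     # (|uu> + |dd>) / sqrt(2)
disguised = [[0, r], [0, r]]   # (|ud> + |dd>) / sqrt(2)
```

The first three states fail the product test, while `disguised` passes it, in line with the rewriting as $|\rightarrow\rangle|\downarrow\rangle$.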


5.1.1 The Einstein, Podolsky, Rosen “Paradox” 


In 1935, Einstein, Podolsky and Rosen tried to use the property of entanglement to 
argue that quantum mechanics is incomplete. Ultimately, this attempt failed, revealing 
instead the jarring differences between quantum mechanics and our classical worldview. 


Here is the EPR argument. We prepare two particles in the state (5.2) and subse- 
quently separate these particles by a large distance. There is a tradition in this field, 
imported from the world of cryptography, to refer to experimenters as Alice and Bob 
and it would be churlish of me to deny you this joy. So Alice and Bob sit in distant 
locations, each carrying one of the spins of the EPR pair. Let’s say Alice chooses to 
measure her spin in the z-direction. There are two options: she either finds spin up $|\uparrow\rangle$
or spin down $|\downarrow\rangle$ and, according to the rules of quantum mechanics, each of these happens
with probability 50%. Similarly, Bob can measure the spin of the second particle
and also finds spin up or spin down, again with probability 50%. 


However, the measurements of Alice and Bob are not uncorrelated. If Alice measures 
the first particle to have spin up, then the EPR pair (5.2) collapses to $|\uparrow\rangle|\downarrow\rangle$, which
means that Bob must measure the spin of the second particle to have spin down. It
would appear that, regardless of how far apart they are, the measurement of Alice determines
the measurement of Bob: whatever Alice sees, Bob always sees the opposite. Viewed
in the usual framework of quantum mechanics, these correlations arise because of a
“collapse of the wavefunction” which happens instantaneously.


Now, for any theoretical physicist — and for Einstein in particular — the word “in- 
stantaneous” should ring alarm bells. It appears to be in conflict with special relativity 
and, although we have not yet made any attempt to reconcile quantum mechanics with 
special relativity, it would be worrying if they are incompatible on such a fundamental 
level. 


The first thing to say is that there is no direct conflict with locality, in the sense that 
there is no way to use these correlations to transmit information faster than light. Alice 
and Bob cannot use their entangled pair to send signals to each other: if Bob measures 
spin down then he has no way of knowing whether this happened because he collapsed 
the wavefunction, or if it happened because Alice has already made a measurement and 
found spin up. Nonetheless, the correlations that arise appear to be non-local and this 
might lead to a sense of unease. 


There is, of course, a much more mundane explanation for the kinds of correlations 
that arise from EPR pairs. Suppose that I take off my shoes and give one each to Alice 
and Bob, but only after I’ve sealed them in boxes. I send them off to distant parts of 
the Universe where they open the boxes to discover which of my shoes they’ve been 
carrying across the cosmos. If Alice is lucky, she finds that she has my left shoe. (It is 
a little advertised fact that Alice has only one leg.) Bob, of course, must then have my 
right shoe. But there is nothing miraculous or non-local in all of this. The parity of 
the shoe was determined from the beginning; any uncertainty Alice and Bob had over 
which shoe they were carrying was due only to their ignorance, and my skill at hiding 
shoes in boxes. 


This brings us to the argument of EPR. The instantaneous collapse of the wavefunc- 
tion in quantum mechanics is silly and apparently non-local. It would be much more 
sensible if the correlations in the spins could be explained in the same way as the corre- 
lations in shoes. But if this is so, then quantum mechanics must be incomplete because 
the state (5.2) doesn’t provide a full explanation of the state of the system. Instead, 
the outcome of any measurement should be determined by some property of the spins 
that is not encoded in the quantum state (5.2), some extra piece of information that 
was there from the beginning and says what the result of any measurement will give. 
This hypothetical extra piece of information is usually referred to as a hidden variable. 
It was advocated by Einstein and friends as a way of restoring some common sense to 
the world of quantum mechanics, one that fits more naturally with our ideas of locality.


There’s no reason that we should have access to these hidden variables. They could
be lying beyond our reach, an inaccessible deterministic world which we can never 
see. In this picture, our ignorance of these hidden variables is where the probability of 
quantum mechanics comes from, and the uncertainties of quantum mechanics are then 
no different from the uncertainties that arise in the weather or in the casino. They are 
due, entirely, to lack of knowledge. This wonderfully comforting vision of the Universe 
is sometimes called local realism. It is, as we will now show, hopelessly naive. 


5.1.2 Bell’s Inequality 


The hypothetical hidden variables that determine the measurements of spin must be 
somewhat more subtle than those that determine the measurement of my shoes. This 
is because there’s nothing to stop Alice and Bob measuring the spin in directions other 
than the z-axis. 


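Suppose, for example, that both choose to measure the spin in the x-direction. The
eigenstates for a single spin are


$$|\rightarrow\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle + |\downarrow\rangle\right)\ ,\qquad |\leftarrow\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle - |\downarrow\rangle\right)$$


with eigenvalues $+\hbar/2$ and $-\hbar/2$ respectively. We can write the EPR pair (5.2) as


$$|EPR\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle\right) = \frac{1}{\sqrt{2}}\left(|\leftarrow\rangle|\rightarrow\rangle - |\rightarrow\rangle|\leftarrow\rangle\right)$$


So we again find correlations if the spins are measured along the x-axis: whenever Alice
finds spin $+\hbar/2$, then Bob finds spin $-\hbar/2$ and vice-versa. Any hidden variable has to
account for this too. Indeed, the hypothetical hidden variables have to account for the
measurement of the spin along any choice of axis. This will prove to be their downfall.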


A Review of Spin 


Before we proceed, let’s first review a few facts about how we measure the spin along
different axes. An operator that measures spin along the direction $\mathbf{a} = (\sin\theta, 0, \cos\theta)$
is

$$\boldsymbol{\sigma}\cdot\mathbf{a} = \begin{pmatrix}\cos\theta & \sin\theta \\ \sin\theta & -\cos\theta\end{pmatrix}$$


Below we’ll denote this matrix as $\boldsymbol{\sigma}\cdot\mathbf{a} = \sigma_\theta$. It has eigenvectors


$$|\theta_+\rangle = \cos\frac{\theta}{2}|\uparrow\rangle + \sin\frac{\theta}{2}|\downarrow\rangle \quad {\rm and} \quad |\theta_-\rangle = -\sin\frac{\theta}{2}|\uparrow\rangle + \cos\frac{\theta}{2}|\downarrow\rangle$$


From this, we learn that if we prepare a state in, say, $|\downarrow\rangle$, then the probability $P(\theta_\pm)$
of measuring either spin + or spin − along the vector $\mathbf{a}$ is


$$P(\theta_+) = \sin^2\frac{\theta}{2} \quad {\rm and} \quad P(\theta_-) = \cos^2\frac{\theta}{2}$$


From the form of the eigenstates $|\theta_\pm\rangle$, we see that the EPR pair can be written as


$$|EPR\rangle = \frac{1}{\sqrt{2}}\left(|\theta_+\rangle|\theta_-\rangle - |\theta_-\rangle|\theta_+\rangle\right) \tag{5.3}$$


for any $\theta$. This means that, as long as Alice and Bob both choose to measure the spin
along the same direction $\mathbf{a}$, then their results will always be perfectly anti-correlated:
when one measures spin + the other is guaranteed to measure spin −. This is a special
property of the EPR pair that is not shared by other entangled states. It follows from
some group theory: under addition of angular momentum $\frac{1}{2}\otimes\frac{1}{2} = 0\oplus 1$, and the EPR
state is the rotationally invariant singlet.
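

The statements in this review are easy to confirm numerically. The following sketch (function names are mine, and two-spin amplitudes are ordered as $\uparrow\uparrow, \uparrow\downarrow, \downarrow\uparrow, \downarrow\downarrow$ by assumption) checks that $|\theta_\pm\rangle$ are eigenvectors of $\sigma_\theta$ and rebuilds the EPR pair from them as in (5.3).

```python
import math

# The singlet (|ud> - |du>) / sqrt(2), amplitudes ordered (uu, ud, du, dd).
SINGLET = [0.0, 1 / math.sqrt(2), -1 / math.sqrt(2), 0.0]


def theta_states(theta):
    """Eigenvectors of sigma_theta in the (up, down) basis."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    plus = (c, s)     # eigenvalue +1
    minus = (-s, c)   # eigenvalue -1
    return plus, minus


def apply_sigma_theta(theta, v):
    """Act with the matrix ((cos, sin), (sin, -cos)) on a two-component vector."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] + s * v[1], s * v[0] - c * v[1])


def tensor(u, v):
    """Amplitudes of |u>|v>, ordered (uu, ud, du, dd)."""
    return [u[0] * v[0], u[0] * v[1], u[1] * v[0], u[1] * v[1]]


def epr_in_theta_basis(theta):
    """(|theta+>|theta-> - |theta->|theta+>) / sqrt(2) in the z-basis."""
    plus, minus = theta_states(theta)
    first, second = tensor(plus, minus), tensor(minus, plus)
    return [(x - y) / math.sqrt(2) for x, y in zip(first, second)]
```

For any angle, `epr_in_theta_basis(theta)` should reproduce `SINGLET` up to rounding, which is the rotational invariance used in the text.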


What’s Wrong with Hidden Variables 


Suppose now that Alice measures the spin along the z-axis, and Bob measures the spin
along the $\mathbf{a}$ axis. If Alice measures spin $|\uparrow\rangle$, then we know that Bob has spin $|\downarrow\rangle$,
so whether he measures spin + or − is determined by the probabilities above. We’ll
write this as $P(\sigma^A_z, \sigma^B_\theta)$ where $\sigma^A$ denotes the spin measured by Alice and $\sigma^B$ the spin
measured by Bob. The four possibilities are


$$P(\sigma^A_z = +, \sigma^B_\theta = +) = \frac{1}{2}\sin^2\frac{\theta}{2}\ ,\qquad P(\sigma^A_z = +, \sigma^B_\theta = -) = \frac{1}{2}\cos^2\frac{\theta}{2}$$

$$P(\sigma^A_z = -, \sigma^B_\theta = +) = \frac{1}{2}\cos^2\frac{\theta}{2}\ ,\qquad P(\sigma^A_z = -, \sigma^B_\theta = -) = \frac{1}{2}\sin^2\frac{\theta}{2} \tag{5.4}$$


Note, in particular, that if $\theta = 0$ so that Alice and Bob measure the spin along the
same axis, then we revert to our previous perfect anti-correlation.
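

These four probabilities follow from the Born rule applied to the EPR state. A minimal sketch, assuming the amplitude ordering $\uparrow\uparrow, \uparrow\downarrow, \downarrow\uparrow, \downarrow\downarrow$ and a hypothetical function name:

```python
import math


def joint_probability(alice_z, bob_theta_sign, theta):
    """P(sigma_z^A = alice_z, sigma_theta^B = bob_theta_sign) in the EPR state.
    Both signs are +1 or -1. All amplitudes here are real."""
    r = 1 / math.sqrt(2)
    epr = [0.0, r, -r, 0.0]                      # (uu, ud, du, dd)
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    alice = (1.0, 0.0) if alice_z == +1 else (0.0, 1.0)   # z eigenstates
    bob = (c, s) if bob_theta_sign == +1 else (-s, c)     # |theta+>, |theta->
    # Born rule: |(<alice| <bob|) |EPR>|^2
    amplitude = sum(alice[i] * bob[j] * epr[2 * i + j]
                    for i in range(2) for j in range(2))
    return amplitude ** 2
```

Evaluating the function for the four sign combinations should reproduce (5.4), and the four values sum to one.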


It is not difficult to account for these results in a hidden variables theory. Each of
the particles carries with it two labels $s_z$ and $s_\theta$ which have values +1 or −1 and
determine the result of a spin measurement along the z-axis and $\mathbf{a}$ axis respectively.
The perfect anti-correlation means that the value of each spin for Bob’s particle must
be the opposite of Alice’s. We write $s^B_z = -s^A_z$ and $s^B_\theta = -s^A_\theta$. We then only need to
talk about the probability distribution $p(s^A_z, s^A_\theta)$ for the spins of Alice’s particles. To
reproduce the predictions (5.4), we must take these to be


$$P(s^A_z = +, s^A_\theta = -) = \frac{1}{2}\sin^2\frac{\theta}{2}\ ,\qquad P(s^A_z = +, s^A_\theta = +) = \frac{1}{2}\cos^2\frac{\theta}{2}$$

$$P(s^A_z = -, s^A_\theta = -) = \frac{1}{2}\cos^2\frac{\theta}{2}\ ,\qquad P(s^A_z = -, s^A_\theta = +) = \frac{1}{2}\sin^2\frac{\theta}{2} \tag{5.5}$$


Mathematically this is straightforward: the probability distributions are, after all,
essentially the same as those in (5.4). But physically we’ve done something a little
slippery. We’ve said that whenever Bob measures his spin $\sigma^B_\theta$ to be, say, +1 then this
determines the spin of Alice’s particle to be $s^A_\theta = -1$ even though Alice didn’t measure
the spin in the direction $\mathbf{a}$. In this way, we’ve managed to assign labels to Alice’s
particle corresponding to spin in two different directions. But this is against the spirit
of quantum mechanics because these operators for spins in different directions don’t
commute. Indeed, we will now see that the spirit of quantum mechanics will come back
and bite us.


The trouble comes when we throw a third possible measurement into the mix. Suppose
that Alice and Bob are given a choice. Each can measure the spin along the z-axis,
along the $\mathbf{a} = (\sin\theta, 0, \cos\theta)$ axis or along the $\mathbf{b} = (\sin\phi, 0, \cos\phi)$ axis. Now each
particle must be assigned a hidden variable that determines the outcome of each of these
measurements. So Alice’s particle comes with $s^A_z$, $s^A_\theta$ and $s^A_\phi$, each of which can take
value ±1. The probabilities of the different choices are governed by some distribution
$p(s^A_z, s^A_\theta, s^A_\phi)$. We will now show that no such distribution exists that can reproduce the
results of measurements of the EPR pair.


Let’s assume that such a distribution does exist. This implies certain relations between
the probability distributions $P(s^A_i, s^A_j)$. For example, by summing over the variables
which weren’t measured, we find


$$\begin{aligned} P(s^A_\theta = +, s^A_\phi = -) &= p(++-) + p(-+-) \\ &\leq [p(++-) + p(+++)] + [p(-+-) + p(---)] \\ &= P(s^A_z = +, s^A_\theta = +) + P(s^A_z = -, s^A_\phi = -) \end{aligned}$$

But we know what each of these distributions $P(s^A_i, s^A_j)$ must be: they are given by
(5.5). This then gives the Bell inequality

$$\sin^2\frac{\theta - \phi}{2} \leq \cos^2\frac{\theta}{2} + \cos^2\frac{\phi}{2} \tag{5.6}$$


where the left-hand side follows from the rotational invariance (5.3) of the EPR state. 


There’s a problem with the Bell inequality (5.6): it’s simply not true for all values
of $\theta$ and $\phi$! Suppose, for example, that we take $\theta = 3\pi/2$ and $\phi = 3\pi/4$. Then


$$\sin^2\frac{3\pi}{8} - \cos^2\frac{3\pi}{8} = -\cos\frac{3\pi}{4} = \frac{1}{\sqrt{2}}$$


Meanwhile


$$\cos^2\frac{3\pi}{4} = \frac{1}{2}$$


Obviously $1/2 < 1/\sqrt{2}$. These values violate the Bell inequality.
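

Both halves of the argument can be checked by brute force. In the sketch below (function names and the random sampling are my own choices for illustration), the first two functions evaluate the two sides of (5.6) using the quantum probabilities, while `hidden_variable_gap` evaluates the same combination for an arbitrary distribution over the eight hidden-variable assignments $(s_z, s_\theta, s_\phi)$. The gap is never negative, because it equals $p(+++) + p(---)$.

```python
import itertools
import math
import random


def bell_lhs(theta, phi):
    """Left-hand side of (5.6): sin^2((theta - phi) / 2)."""
    return math.sin((theta - phi) / 2) ** 2


def bell_rhs(theta, phi):
    """Right-hand side of (5.6): cos^2(theta / 2) + cos^2(phi / 2)."""
    return math.cos(theta / 2) ** 2 + math.cos(phi / 2) ** 2


def hidden_variable_gap(p):
    """For a distribution p over (s_z, s_theta, s_phi) in {+1, -1}^3, return
    P(s_z=+, s_theta=+) + P(s_z=-, s_phi=-) - P(s_theta=+, s_phi=-)."""
    def prob(condition):
        return sum(w for s, w in p.items() if condition(*s))

    return (prob(lambda z, t, f: z == 1 and t == 1)
            + prob(lambda z, t, f: z == -1 and f == -1)
            - prob(lambda z, t, f: t == 1 and f == -1))


def random_distribution(rng):
    """A random normalised distribution over the eight assignments."""
    weights = [rng.random() for _ in range(8)]
    total = sum(weights)
    keys = itertools.product((1, -1), repeat=3)
    return {k: w / total for k, w in zip(keys, weights)}
```

At $\theta = 3\pi/2$, $\phi = 3\pi/4$ the quantum left-hand side exceeds the right-hand side, whereas `hidden_variable_gap(random_distribution(random.Random(0)))` stays non-negative for every seed.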


The Bell inequality (5.6) was derived under the assumption that there was some 
hidden variable underlying quantum mechanics. Its violation tells us that this is simply 
not possible. Of course, physics is an experimental science and we can ask whether or 
not the Bell inequalities are violated in Nature. They are. The experiment was first 
done in the early 1980s by Aspect and has been repeated many times since, with 
different groups trying to finesse the experiments in order to close off increasingly 
preposterous loopholes that philosophers claim to have discovered in the argument. 


The original EPR argument was an attempt to show that locality, together with com- 
mon sense, imply that there should be hidden variables underlying quantum mechanics. 
Nature, however, disagrees. Indeed, the Bell inequalities turn the EPR argument com- 
pletely on its head. If you want to keep locality, then you’re obliged to give up common 
sense which, here, means a view of the world in which particles carry the properties 
that are measured. In contrast, if you want to keep common sense, you will have to give 
up locality. Such a loophole arises because the derivation of Bell’s inequality assumed 
that a measurement on one particle does not affect the probability distribution of the 
other. Given that the two particles can be separated by arbitrarily large distances, any 
such effect must be superluminal and, hence, non-local. Therefore, the best one can 
say is that Bell’s argument forbids local hidden variable theories. 


Most physicists cherish locality over common sense. In particular, all of our most 
successful laws of physics are written in the language of Quantum Field Theory, which 
is the framework that combines quantum mechanics with local dynamics. With locality 
sitting firmly at the heart of physics, it is very difficult to see a role for any kind of hidden
variables. 


It is sometimes said that the correlations inherent in EPR-type pairs are non-local. I 
don’t think this is a particularly helpful way to characterise these correlations because, 
as we have seen, there is no way to use them to signal faster than light. Nonetheless, it 
is true that the correlations that arise in quantum mechanics cannot arise in any local 
classical model of reality. But the key lesson to take from this is not that our Universe 
is non-local; it is instead that our Universe is non-classical.


5.1.3 CHSH Inequality


The essence of Bell’s inequality can be distilled to a simpler form, due to Clauser, 
Horne, Shimony and Holt. 


We stick with the general framework where Alice and Bob are each sent a two-state
quantum system. Alice can choose to measure one of two quantum observables,
$A_1$ or $A_2$. Similarly, Bob can choose to measure $B_1$ or $B_2$. Each of these observables
has two possible eigenvalues, $a_i = \pm 1$ and $b_i = \pm 1$.


We require that


$$[A_i, B_j] = 0\ ,\qquad i, j = 1, 2 \tag{5.7}$$


This is the statement that Alice and Bob can happily perform their measurements
without interfering with the other. In particular, this is where the assumption of
locality comes in: if Alice and Bob are spacelike separated then (5.7) must hold. In
contrast, we will make no such assumption about $[A_1, A_2]$ or $[B_1, B_2]$.


We’re going to look at the expectation value of the observable

$$C = (A_1 + A_2)B_1 + (A_1 - A_2)B_2 \tag{5.8}$$


We do this first in a hidden variable theory, and next in the quantum theory. We’ll
see that a hidden variable theory places a stricter range on the allowed values of the
expectation value $\langle C\rangle$. To see this, we make the seemingly innocuous assumption that
the system possesses well-defined values for $a_i$ and $b_i$. In this case, we write


$$C_{\rm h.v.} = (a_1 + a_2)b_1 + (a_1 - a_2)b_2 \tag{5.9}$$

But since $a_i = \pm 1$, there are two possibilities

• $a_1 + a_2 = 0 \ \Rightarrow\ a_1 - a_2 = \pm 2$
• $a_1 - a_2 = 0 \ \Rightarrow\ a_1 + a_2 = \pm 2$


In either case, $C_{\rm h.v.} = \pm 2b_i$ for some $b_i$. Since $b_i$ can only take values ±1, we have
$|\langle b_i\rangle| \leq 1$, and so


$$-2 \leq \langle C_{\rm h.v.}\rangle \leq 2 \tag{5.10}$$


This is the CHSH inequality. It is entirely analogous to the Bell inequality (5.6).


What about in quantum theory? Now we don’t admit to $a_1$ and $a_2$ having simultaneous
meaning, so we’re not allowed to write (5.9). Instead, we have to manipulate
(5.8) as an operator equation. Because the eigenvalues are ±1, we must have
$A_1^2 = A_2^2 = B_1^2 = B_2^2 = \mathbf{1}$, the identity operator. After a little algebra, we find


$$\begin{aligned} C^2 &= 4\,\mathbf{1} - A_1A_2B_1B_2 + A_2A_1B_1B_2 + A_1A_2B_2B_1 - A_2A_1B_2B_1 \\ &= 4\,\mathbf{1} - [A_1, A_2][B_1, B_2] \end{aligned}$$


Now $|\langle[A_1, A_2]\rangle| \leq |\langle A_1A_2\rangle| + |\langle A_2A_1\rangle| \leq 2$, since each operator has eigenvalue ±1.
From this we learn that in the quantum theory,


$$\langle C^2\rangle \leq 8$$

Since $\langle C^2\rangle \geq \langle C\rangle^2$, we find the range of values in quantum mechanics to be

$$-2\sqrt{2} \leq \langle C\rangle \leq 2\sqrt{2}$$


This is referred to as the Cirel’son bound. Clearly the range of values allowed by
quantum mechanics exceeds that allowed by hidden variables theories (5.10).
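

The hidden variable bound (5.10) can be confirmed by listing all sixteen assignments of $a_1, a_2, b_1, b_2 = \pm 1$. The function name below is hypothetical; the enumeration itself is just (5.9) evaluated case by case.

```python
import itertools


def chsh_hidden_values():
    """All values taken by C = (a1 + a2) b1 + (a1 - a2) b2 when each of
    a1, a2, b1, b2 is +1 or -1."""
    return sorted({(a1 + a2) * b1 + (a1 - a2) * b2
                   for a1, a2, b1, b2 in itertools.product((1, -1), repeat=4)})


print(chsh_hidden_values())  # [-2, 2]
```

Since every assignment gives exactly ±2, any average over a probability distribution of assignments lies between −2 and 2.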


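It remains for us to exhibit states and operators which violate the CHSH bound. For
this, we can return to our spin model. From (5.4), we know that


$$\langle EPR|\,\sigma^A_z\otimes\sigma^B_\theta\,|EPR\rangle = \sin^2\frac{\theta}{2} - \cos^2\frac{\theta}{2} = -\cos\theta$$

This means that if we take the four operators $A_2$, $B_1$, $A_1$ and $B_2$ to be spin operators,
aligned in the (x, z) plane at successive angles of 45° (i.e. $A_2$ has $\theta = 0$, $B_1$ has $\theta = \frac{\pi}{4}$, $A_1$
has $\theta = \frac{\pi}{2}$ and $B_2$ has $\theta = \frac{3\pi}{4}$), then


$$\langle A_1B_1\rangle = \langle A_1B_2\rangle = \langle A_2B_1\rangle = -\frac{1}{\sqrt{2}} \quad {\rm and} \quad \langle A_2B_2\rangle = +\frac{1}{\sqrt{2}}$$


and we see that


$$\langle C\rangle = -2\sqrt{2}$$


saturating the Cirel’son bound.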
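

This value is simple to reproduce by direct summation over the four components of the EPR state. The sketch below is illustrative only: the function names are assumptions, and the angles are those quoted above for $A_1$, $A_2$, $B_1$ and $B_2$.

```python
import math


def spin_op(theta):
    """sigma_theta for the direction a = (sin theta, 0, cos theta)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]


def correlator(theta_a, theta_b):
    """<EPR| sigma_{theta_a} (x) sigma_{theta_b} |EPR>, summed explicitly.
    epr[i][j] is the (real) amplitude of |i>|j>, with 0 = up and 1 = down."""
    r = 1 / math.sqrt(2)
    epr = [[0.0, r], [-r, 0.0]]
    a_op, b_op = spin_op(theta_a), spin_op(theta_b)
    return sum(epr[i][j] * a_op[i][k] * b_op[j][l] * epr[k][l]
               for i in range(2) for j in range(2)
               for k in range(2) for l in range(2))


def chsh_value():
    """<C> = <A1 B1> + <A2 B1> + <A1 B2> - <A2 B2> for the angles in the text."""
    a1, a2 = math.pi / 2, 0.0
    b1, b2 = math.pi / 4, 3 * math.pi / 4
    return (correlator(a1, b1) + correlator(a2, b1)
            + correlator(a1, b2) - correlator(a2, b2))
```

`correlator(0.0, theta)` should return $-\cos\theta$, and `chsh_value()` should equal $-2\sqrt{2}$ up to rounding.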


5.1.4 Entanglement Between Three Particles 


If we consider the case of three particles rather than two, then there is an even sharper
contradiction between the predictions of quantum mechanics and those of hidden variables
theories. As before, we’ll take each particle to carry one of two states, with a
basis given by spins $|\uparrow\rangle$ and $|\downarrow\rangle$, measured in the z-direction.


Consider the entangled three-particle state

$$|GHZ\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\uparrow\rangle|\uparrow\rangle - |\downarrow\rangle|\downarrow\rangle|\downarrow\rangle\right)$$


named after Greenberger, Horne and Zeilinger. These three particles are sent to our
three intrepid scientists, each waiting patiently in far-flung corners of the galaxy. Each
of these scientists makes one of two measurements: they either measure the spin in the
x-direction, or they measure the spin in the y-direction. Obviously, each experiment
gives them the result +1 or −1.


The state $|GHZ\rangle$ will result in correlations between the different measurements.
Suppose, for example, that two of the scientists measure $\sigma_y$ and the other measures $\sigma_x$.
It is simple to check that


$$\sigma^A_x\otimes\sigma^B_y\otimes\sigma^C_y|GHZ\rangle = \sigma^A_y\otimes\sigma^B_x\otimes\sigma^C_y|GHZ\rangle = \sigma^A_y\otimes\sigma^B_y\otimes\sigma^C_x|GHZ\rangle = +|GHZ\rangle$$

In other words, the product of the scientists’ three measurements always equals +1.

It’s tempting to follow the hidden variables paradigm and assign a spin $s_x$ and $s_y$
to each of these three particles. Let’s suppose we do so. Then the result above means
that


$$s^A_xs^B_ys^C_y = s^A_ys^B_xs^C_y = s^A_ys^B_ys^C_x = +1 \tag{5.11}$$

But from this knowledge we can make a simple prediction. If we multiply all of these
results together, we get

$$s^A_xs^B_xs^C_x\,(s^A_ys^B_ys^C_y)^2 = +1 \quad\Rightarrow\quad s^A_xs^B_xs^C_x = +1 \tag{5.12}$$


where the implication follows from the fact that the spin variables can only take values
±1. The hidden variables tell us that whenever the correlations (5.11) hold, the
correlation (5.12) must also hold.
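

The implication can also be verified by exhausting all 64 assignments of the six hidden variables. The function name is hypothetical; the logic is just the constraints (5.11) applied as a filter.

```python
import itertools


def forced_xxx_products():
    """Enumerate s_x, s_y = +1 or -1 for particles A, B and C, keep only the
    assignments obeying (5.11), and collect the values of s_x^A s_x^B s_x^C."""
    products = set()
    for sxA, sxB, sxC, syA, syB, syC in itertools.product((1, -1), repeat=6):
        if (sxA * syB * syC == 1
                and syA * sxB * syC == 1
                and syA * syB * sxC == 1):
            products.add(sxA * sxB * sxC)
    return products


print(forced_xxx_products())  # {1}
```

Every assignment compatible with (5.11) has $s^A_xs^B_xs^C_x = +1$, which is the content of (5.12).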


Let’s now look at what quantum mechanics tells us. Rather happily, $|GHZ\rangle$ happens
to be an eigenstate of $\sigma^A_x\otimes\sigma^B_x\otimes\sigma^C_x$. But we have


$$\sigma^A_x\otimes\sigma^B_x\otimes\sigma^C_x|GHZ\rangle = -|GHZ\rangle$$


In other words, the product of these three measurements must give −1. This is in stark
contradiction to the hidden variables result (5.12). Once again we see that local hidden
variables are incapable of reproducing the results of quantum mechanics.


If Only We Hadn’t Made Counterfactual Arguments...


In both the Bell and GHZ arguments, the mistake in assigning hidden variables can be 
traced to our use of counterfactuals. This is the idea that we can say what would have 
happened had we made different choices. 


Suppose, for example, that Alice measures $\sigma_z$ to be +1 in an EPR state.
Then Bob can be absolutely certain that he will find $\sigma_z$ to be −1 should he choose to
measure it. But even that certainty doesn’t give him the right to assign $s^B_z = -1$ unless
he actually goes ahead and measures it. This is because he may want to measure spin
along some other axis, $\sigma^B_\theta$, and assuming that both properties exist will lead us to the
wrong conclusion as we’ve seen above. The punchline is that you don’t get to make
counterfactual arguments based on what would have happened: only arguments based 
on what actually did happen. 


5.1.5 The Kochen-Specker Theorem 


The Kochen-Specker theorem provides yet another way to restrict putative hidden- 
variables theories. Here is the statement: 


Consider a set of N Hermitian operators $A_i$ acting on $\mathcal{H}$. Typically some of these
operators will commute with each other, while others will not. Any subset of operators
which mutually commute will be called compatible.


In an attempt to build a hidden variables theory, all observables $A_i$ are assigned a
value $a_i \in \mathbb{R}$. We will require that whenever $A$, $B$ and $C \in \{A_i\}$ are compatible then
the following properties should hold


• If $C = A + B$ then $c = a + b$.
• If $C = AB$ then $c = ab$.


These seem like sensible requirements. Indeed, in quantum mechanics we know that if
$[A, B] = 0$ then the expectation values obey the relations above and, moreover, there
are states where we can assign definite values to $A$, $B$ and therefore to $A + B$ and to
$AB$. We will not impose any such requirements if $[A, B] \neq 0$.


As innocuous as these requirements may seem, the Kochen-Specker theorem states
that in Hilbert spaces $\mathcal{H}$ with dimension $\dim(\mathcal{H}) \geq 3$, there are sets of operators $\{A_i\}$
for which it is not possible to assign values $a_i$ with these properties. Note that this
isn’t a statement about a specific state in the Hilbert space; it’s a stronger statement
that there are no consistent values that can possibly be assigned to the operators.
The issue is that a given operator, say $A$, can be compatible with many different
operators. So, for example, it may appear in the compatible set $(A, B, C)$ and also in
$(A, D, E)$ and should take the same value $a$ in both. Meanwhile, $B$ may appear in a
different compatible set and so on. The proofs of the Kochen-Specker theorem involve
exhibiting a collection of operators which cannot be consistently assigned values.


The original proof of the Kochen-Specker theorem is notoriously fiddly, involving a
set of N = 117 different projection operators in a $\dim(\mathcal{H}) = 3$ dimensional Hilbert
space*. Simpler versions of the proof with $\dim(\mathcal{H}) = 3$ now exist, although we won’t
present them here.


There is, however, a particularly straightforward proof that involves N = 18 operators
in a $\dim(\mathcal{H}) = 4$ dimensional Hilbert space. We start by considering the following
18 vectors $\psi_i \in \mathbb{C}^4$,


$$\begin{aligned}
\psi_1 &= (0,0,0,1)\,, & \psi_2 &= (0,0,1,0)\,, & \psi_3 &= (1,1,0,0)\,, & \psi_4 &= (1,-1,0,0) \\
\psi_5 &= (0,1,0,0)\,, & \psi_6 &= (1,0,1,0)\,, & \psi_7 &= (1,0,-1,0)\,, & \psi_8 &= (1,-1,1,-1) \\
\psi_9 &= (1,-1,-1,1)\,, & \psi_{10} &= (0,0,1,1)\,, & \psi_{11} &= (1,1,1,1)\,, & \psi_{12} &= (0,1,0,-1) \\
\psi_{13} &= (1,0,0,1)\,, & \psi_{14} &= (1,0,0,-1)\,, & \psi_{15} &= (0,1,-1,0)\,, & \psi_{16} &= (1,1,-1,1) \\
\psi_{17} &= (1,1,1,-1)\,, & \psi_{18} &= (-1,1,1,1)
\end{aligned}$$


From each of these, we can build a projection operator


$$P_i = \frac{|\psi_i\rangle\langle\psi_i|}{\langle\psi_i|\psi_i\rangle}$$


Since the projection operators can only take eigenvalues 0 or 1, we want to assign a
value $p_i = 0$ or $p_i = 1$ to each projection operator $P_i$.


Of course, most of these projection operators do not commute with each other.
However, there are subsets of four such operators which mutually commute and sum
to give the identity operator. For example,


$$P_1 + P_2 + P_3 + P_4 = \mathbb{1}$$


In this case, the requirements of the Kochen-Specker theorem tell us that one of these
operators must have value p = 1 and the other three must have value p = 0.


* More details can be found at https://plato.stanford.edu/entries/kochen-specker/.
Now comes the twist. We can, in fact, construct nine such subsets of four operators.
These are listed in the columns of the following table:


P_1  | P_1  | P_8  | P_8  | P_2  | P_9  | P_16 | P_16 | P_17
P_2  | P_5  | P_9  | P_11 | P_5  | P_11 | P_17 | P_18 | P_18
P_3  | P_6  | P_3  | P_7  | P_13 | P_14 | P_4  | P_6  | P_13
P_4  | P_7  | P_10 | P_12 | P_14 | P_15 | P_10 | P_12 | P_15


This table has the nice property that each $P_i$ appears in exactly two different columns.
Now the task is clear: assign values $p_i = 0, 1$ to each $P_i$ such that each column has a
single p = 1 and three p = 0. It is best to sit down and try to do this. And then try
again. By the time you’ve tried for the third time, it should be increasingly clear that
no consistent assignment of values $p_i$ is possible. And the reason is simple: because each
projection operator appears twice, whichever projection operators you assign p = 1, you
will always end up with an even number of values p = 1 in the table. But the goal is
only achieved if you assign exactly one to each of the nine columns, so we want an odd number.
Clearly it’s not possible. This is the Kochen-Specker theorem.
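
If you would rather let a computer do the trying, the argument can be checked by brute force. The sketch below is only an illustration: it encodes the 18 vectors and the nine columns of the table, confirms that each column consists of four mutually orthogonal vectors (so that the corresponding projectors sum to the identity), confirms that each label appears in exactly two columns, and then searches all $2^{18}$ assignments. The variable names are ours.

```python
from itertools import product
import numpy as np

# The 18 vectors psi_1 ... psi_18 (stored 0-indexed)
psi = np.array([
    (0, 0, 0, 1), (0, 0, 1, 0), (1, 1, 0, 0), (1, -1, 0, 0),
    (0, 1, 0, 0), (1, 0, 1, 0), (1, 0, -1, 0), (1, -1, 1, -1),
    (1, -1, -1, 1), (0, 0, 1, 1), (1, 1, 1, 1), (0, 1, 0, -1),
    (1, 0, 0, 1), (1, 0, 0, -1), (0, 1, -1, 0), (1, 1, -1, 1),
    (1, 1, 1, -1), (-1, 1, 1, 1),
])

# The nine columns of the table, using the labels 1..18
columns = [
    (1, 2, 3, 4), (1, 5, 6, 7), (8, 9, 3, 10), (8, 11, 7, 12),
    (2, 5, 13, 14), (9, 11, 14, 15), (16, 17, 4, 10),
    (16, 18, 6, 12), (17, 18, 13, 15),
]

def is_orthogonal_set(col):
    """True if the four vectors labelled by a column are mutually orthogonal."""
    vecs = psi[[i - 1 for i in col]]
    gram = vecs @ vecs.T
    off_diagonal = gram - np.diag(np.diag(gram))
    return np.count_nonzero(off_diagonal) == 0

all_orthogonal = all(is_orthogonal_set(c) for c in columns)

# How many columns does each label appear in?
counts = [sum(c.count(i) for c in columns) for i in range(1, 19)]

# Brute force: try every assignment p_i in {0, 1} and count those for which
# every column contains exactly one p = 1.
valid = 0
for p in product((0, 1), repeat=18):
    if all(sum(p[i - 1] for i in col) == 1 for col in columns):
        valid += 1

print(all_orthogonal, set(counts), valid)
```

The search finds no valid assignment, in agreement with the parity argument: summing the column constraints counts every $p_i$ twice, giving an even number on one side and nine on the other.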


5.2 Entanglement is a Resource 


In the previous section, we used entangled states to reveal how quantum mechanics 
differs from our older, classical framework. In this section, we will view entanglement 
somewhat differently. It is a precious commodity that allows us to achieve things that 
classical physicists cannot. 


5.2.1 The CHSH Game 


To illustrate the advantage that entanglement brings, we start by describing a game. 
It’s not a particularly fun game. It’s designed purely as a point of principle to show 
that entanglement can be useful. 


The game is one of cooperation between two players — Alice and Bob of course — who 
cannot communicate with each other, but can prepare a strategy beforehand. Alice 
and Bob are both given an envelope. Inside each envelope is either a red card or a blue 
card. This means that there are four possibilities for their cards: red/red, red/blue, 
blue/red or blue/blue. 


After seeing their card, Alice and Bob have to decide whether to say “turtles” or to 
say “cucumber”. This is, I think you will agree, a silly game. The rules are as follows: 


• Alice and Bob win if both cards are red and they said different words.
• Alice and Bob win if at least one card was blue and they said the same word.
• Otherwise, they lose.


What’s their best strategy? First suppose that Alice and Bob are classical losers and 
have no help from quantum mechanics. It’s not hard to convince yourself that their 
best strategy is just to say “cucumber” every time, regardless of the colour of their 
card. They only lose if both cards turn out to be red. Otherwise they win. This means 
that they win 75% of the time. 
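
The classical bound is small enough to check exhaustively. The sketch below enumerates every deterministic strategy (a choice of word for each colour of card, for each player) and takes the four card combinations to be equally likely, as in the discussion of the game that follows. A randomised classical strategy is a mixture of deterministic ones, so it cannot do better than the best entry in this list. The function and variable names are ours.

```python
from itertools import product

COLOURS = ("red", "blue")
WORDS = ("turtles", "cucumber")

def wins(card_a, card_b, word_a, word_b):
    """Rules of the game: different words on red/red, the same word otherwise."""
    if card_a == "red" and card_b == "red":
        return word_a != word_b
    return word_a == word_b

best = 0.0
best_strategies = []
# A deterministic strategy assigns one word to each colour of card.
for a_red, a_blue, b_red, b_blue in product(WORDS, repeat=4):
    alice = {"red": a_red, "blue": a_blue}
    bob = {"red": b_red, "blue": b_blue}
    # Average over the four equally likely card combinations
    p_win = sum(wins(ca, cb, alice[ca], bob[cb])
                for ca, cb in product(COLOURS, repeat=2)) / 4
    if p_win > best:
        best, best_strategies = p_win, [(alice, bob)]
    elif p_win == best:
        best_strategies.append((alice, bob))

print(best)
```

No strategy beats 0.75, and always saying “cucumber” is among the strategies that reach it.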


Suppose, however, that Alice and Bob have spent many decades developing coherent 
qubits. This pioneering technology resulted in them being kidnapped by a rival govern- 
ment who then, for reasons hard to fathom, subjected them to this stupid game. Can 
their discoveries help them get out of a bind? Thankfully, the answer is yes. Although, 
arguably, not so much that it’s worth all the trouble. 


To do better, Alice and Bob must share a number of EPR pairs, one for each time
that the game is played. Here is their gameplan. Whenever Alice’s card is blue, she
measures $A_1$; whenever it is red she measures $A_2$. Whenever these measurements give
+1 she says “turtles”; whenever it is −1 she says “cucumber”. Bob does something
similar: $B_1$ when blue, $B_2$ when red; “turtles” when +1, “cucumber” when −1.


Suppose that both cards are blue. Then they win if $A_1$ and $B_1$ give the same result
and lose otherwise. In other words, they win if the measurement gives $A_1 B_1 = +1$ and
lose when $A_1 B_1 = -1$. This means


$$P(\text{win}) - P(\text{lose}) = \langle A_1 B_1 \rangle$$


In contrast, if both cards are red then they lose if $A_2$ and $B_2$ give the same measurement
and win otherwise, so that


$$P(\text{win}) - P(\text{lose}) = -\langle A_2 B_2 \rangle$$

The two cases with one red and one blue card work just like the blue/blue case and contribute
$\langle A_1 B_2 \rangle$ and $\langle A_2 B_1 \rangle$. Since each combination of cards arises with probability $p = \frac{1}{4}$, the total probability is

$$P(\text{win}) - P(\text{lose}) = \frac{1}{4}\langle A_1 B_1 + A_1 B_2 + A_2 B_1 - A_2 B_2 \rangle$$


But we’ve seen this before: it’s precisely the combination of operators (5.8) that arose
in the CHSH proof of the Bell inequality. We can immediately import our answer from
there to learn that


$$P(\text{win}) - P(\text{lose}) \leq \frac{1}{\sqrt{2}}$$
We saw previously that we can find operators which saturate this inequality. Since
P(win) + P(lose) = 1, there’s a choice of measurements $A_i$ and $B_i$ (essentially spin
measurements which differ by 45°) which ensures a win rate of


$$P(\text{win}) = \frac{1}{2}\left(1 + \frac{1}{\sqrt{2}}\right) \approx 0.854$$


This beats our best classical strategy of 75%. 
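
To see the number 0.854 emerge, here is a minimal numerical sketch. It assumes that the EPR pair is the spin singlet $(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle)/\sqrt{2}$, and it makes one particular choice of measurement axes that saturates the bound: Alice measures along z (blue) or x (red), and Bob measures along axes rotated by 45° from these. This choice is ours, made for illustration, and need not coincide with the one used in the CHSH discussion earlier.

```python
import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# Assumed EPR pair: the spin singlet (|up,down> - |down,up>)/sqrt(2)
epr = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

# One choice of measurements (an assumption of this sketch):
# Bob's axes are rotated by 45 degrees relative to Alice's.
A = {"blue": sz, "red": sx}
B = {"blue": -(sz + sx) / np.sqrt(2), "red": (sx - sz) / np.sqrt(2)}

def p_same(a_op, b_op, state):
    """Probability that the two +/-1 outcomes agree: (1 + <A x B>)/2."""
    corr = np.vdot(state, np.kron(a_op, b_op) @ state).real
    return (1 + corr) / 2

p_win = 0.0
for card_a in ("red", "blue"):
    for card_b in ("red", "blue"):
        same = p_same(A[card_a], B[card_b], epr)
        # Both red: they win on different words. Otherwise: on the same word.
        win = (1 - same) if (card_a, card_b) == ("red", "red") else same
        p_win += 0.25 * win

print(round(p_win, 4))
```

With these axes each of the four card combinations is won with the same probability, $\frac{1}{2}(1 + 1/\sqrt{2})$.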


Having the ability to win at this particular game is unlikely to change the world. 
Obviously the game was cooked up by starting from the CHSH inequality and working 
backwards in an attempt to translate Bell’s inequality into something approximating 
a game. But it does reveal an important point: the correlations in entangled states 
can be used to do things that wouldn’t otherwise be possible. If we can harness this 
ability to perform tasks that we actually care about, then we might genuinely be able 
to change the world. This is the subject of quantum information. Here we give a couple 
of simple examples that move in this direction. 


5.2.2 Dense Coding 


For our first application, Alice wants to send Bob some classical information, which
means she wants to tell him “yes” or “no” to a series of questions. Each answer is encoded in
a classical bit as the value 0 or 1.


However, Alice is fancy. She has qubits at her disposal and can send these to Bob. 
We'd like to know if she can use this quantum technology to aid in sending her classical 
information. 


First note that Alice doesn’t lose anything by sending qubits rather than classical
bits. (Apart, of course, from the hundreds of millions of dollars in R&D that it took to
get them in the first place.) She could always encode the classical value 0 as $|\uparrow\rangle$ and
1 as $|\downarrow\rangle$ and, provided Bob is told in advance to measure $\sigma_z$, the qubit contains the
same information as a classical bit. But this does seem like a waste of resources.


Is it possible to do better and transmit more than one classical bit in a single qubit? 
The answer is no: a single qubit carries the same amount of information as a classical 
bit. However, this changes if Alice’s qubit is actually part of an entangled pair that 
she shares with Bob. In this case, she can encode two classical bits of information in a 
single qubit. This is known as dense coding. 
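To achieve this feat, Alice first performs an operation on her spin. We’ll introduce
some new notation for the EPR state that will become useful in the following section: we
call the EPR pair


$$|EPR\rangle = |\chi^-\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle\right)$$


Alice then has four options:

• She does nothing. Obviously, the entangled pair remains in the state $|\chi^-\rangle$.


• Alice acts with $\sigma_x$. This changes the state to $-|\phi^-\rangle$ where

$$|\phi^-\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\uparrow\rangle - |\downarrow\rangle|\downarrow\rangle\right)$$


• Alice acts with $\sigma_y$. This changes the state to $i|\phi^+\rangle$ where

$$|\phi^+\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle\right)$$


• Alice acts with $\sigma_z$. This changes the state to $|\chi^+\rangle$ where

$$|\chi^+\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle + |\downarrow\rangle|\uparrow\rangle\right)$$


The upshot of this procedure is that the entangled pair sits in one of four different
states

$$|\phi^\pm\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\uparrow\rangle \pm |\downarrow\rangle|\downarrow\rangle\right) \quad\text{or}\quad |\chi^\pm\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\downarrow\rangle \pm |\downarrow\rangle|\uparrow\rangle\right) \tag{5.13}$$


Alice now sends her qubit to Bob, so Bob has access to the whole entangled state.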


Since the four different states are orthogonal, it must be possible to distinguish them
by performing some measurements. Indeed, the measurements Bob needs to make are


$$\sigma_x \otimes \sigma_x \quad\text{and}\quad \sigma_z \otimes \sigma_z$$


These two operators commute. This means that, while we don’t get to know the values
of both $s_x$ and $s_z$ of, say, the first spin, it does make sense to talk about the products
of the spins of the two qubits in both directions. It’s simple to check that the four
possible states above are eigenstates of these two operators


$$\sigma_x \otimes \sigma_x |\phi^\pm\rangle = \pm|\phi^\pm\rangle \quad\text{and}\quad \sigma_x \otimes \sigma_x |\chi^\pm\rangle = \pm|\chi^\pm\rangle \tag{5.14}$$

$$\sigma_z \otimes \sigma_z |\phi^\pm\rangle = +|\phi^\pm\rangle \quad\text{and}\quad \sigma_z \otimes \sigma_z |\chi^\pm\rangle = -|\chi^\pm\rangle$$


So, for example, if Bob measures $\sigma_x \otimes \sigma_x = +1$ and $\sigma_z \otimes \sigma_z = -1$, then he knows that
he’s in possession of state $|\chi^+\rangle$. Bob then knows which of the four operations Alice
performed. In this way she has communicated two classical bits of information through
the exchange of a single qubit.
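
The whole protocol fits in a few lines of linear algebra. The sketch below assumes the standard Pauli matrices and takes Alice’s spin to be the first factor in the tensor product. The dictionary `encode`, which assigns a pair of bits to each of Alice’s four operations, is a hypothetical labelling chosen for this illustration; any one-to-one assignment would do.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

# The shared EPR pair |chi^-> = (|up,down> - |down,up>)/sqrt(2)
epr = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)

XX = np.kron(sx, sx)
ZZ = np.kron(sz, sz)

def bob_measures(state):
    """Outcomes of (sigma_x x sigma_x, sigma_z x sigma_z) on one of the four states."""
    xx = np.vdot(state, XX @ state).real
    zz = np.vdot(state, ZZ @ state).real
    return (int(round(xx)), int(round(zz)))

# Hypothetical labelling of Alice's four operations by two classical bits
encode = {(0, 0): I2, (0, 1): sx, (1, 0): sy, (1, 1): sz}

decode = {}
for bits, op in encode.items():
    sent = np.kron(op, I2) @ epr        # Alice acts on her own spin only
    decode[bob_measures(sent)] = bits   # Bob's two outcomes identify the operation

print(decode)
```

The four operations lead to four distinct pairs of outcomes, so Bob’s lookup table `decode` recovers both bits. Overall phases such as the $i$ in front of $|\phi^+\rangle$ play no role in his measurements.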


Admittedly, two qubits were needed for this to fly: one which was exchanged and 
one which was in Bob’s possession all along. In fact, in Section 5.3.2, we’ll show that 
entanglement between spins can only be created if the two spins were brought together 
at some point in the past. So, from this point of view, Alice actually exchanged two 
qubits with Bob, the first long ago when they shared the EPR pair, and the second 
when the message was sent. Nonetheless, there’s still something surprising about dense 
coding. The original EPR pair contained no hint of the message that Alice wanted 
to send; indeed, it could have been created long before she knew what that message 
was. Nor was there any information in the single qubit that Alice sent to Bob. Anyone 
intercepting it along the way would be no wiser. It’s only when this qubit is brought 
together with Bob’s that the information becomes accessible. 


5.2.3 Quantum Teleportation 


Our next application has a sexy sounding name: quantum teleportation. To put it in 
context, we first need a result that tells us what we cannot do in quantum mechanics. 


The No Cloning Theorem 


The no cloning theorem says that it is impossible to copy a state in quantum mechanics. 


Here’s the game. Someone gives you a state $|\psi\rangle$, but doesn’t tell you what that state
is. Now, you can determine some property of the state but any measurement that you
make will alter the state. This means that you can’t then go back and ask different
questions about the initial state.


Our inability to know everything about a state is one of the key tenets of quantum 
mechanics. But there’s an obvious way around it. Suppose that we could just copy the 
initial state many times. Then we could ask different questions on each of the replicas 
and, in this way, build up a fuller picture of the original state. The no cloning theorem 
forbids this. 
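To prove the theorem, we really only need to set up the question. We start with a
state $|\psi\rangle \in \mathcal{H}_A$. Suppose that we prepare a separate system in a blank state $|0\rangle \in \mathcal{H}_B$.
To create a copy of the initial state, we would like to evolve the system so that


$$|\text{In}(\psi)\rangle = |\psi\rangle \otimes |0\rangle \;\longrightarrow\; |\text{Out}(\psi)\rangle = |\psi\rangle \otimes |\psi\rangle$$


But this can’t happen through any Hamiltonian evolution because it is not a unitary
operation. To see this, consider two different states $|\psi_1\rangle$ and $|\psi_2\rangle$. We have


$$\langle \text{In}(\psi_1)|\text{In}(\psi_2)\rangle = \langle\psi_1|\psi_2\rangle \quad\text{while}\quad \langle \text{Out}(\psi_1)|\text{Out}(\psi_2)\rangle = \langle\psi_1|\psi_2\rangle^2$$

A unitary map preserves inner products, so it would need $\langle\psi_1|\psi_2\rangle = \langle\psi_1|\psi_2\rangle^2$. This only
holds when the two states are identical or orthogonal, not for a general pair of states.


We might try to wriggle out of this conclusion by allowing for some other stuff in the
Hilbert space which can change in any way it likes. This means that we now have three
Hilbert spaces and are looking for an evolution of the form


$$|\psi\rangle \otimes |0\rangle \otimes |\alpha(0)\rangle \;\longrightarrow\; |\psi\rangle \otimes |\psi\rangle \otimes |\alpha(\psi)\rangle$$


By linearity, if such an evolution exists it must map


$$(|\phi\rangle + |\psi\rangle) \otimes |0\rangle \otimes |\alpha(0)\rangle \;\longrightarrow\; |\phi\rangle \otimes |\phi\rangle \otimes |\alpha(\phi)\rangle + |\psi\rangle \otimes |\psi\rangle \otimes |\alpha(\psi)\rangle \tag{5.15}$$


But this isn’t what we wanted! The map is supposed to take


$$\begin{aligned}
(|\phi\rangle + |\psi\rangle) \otimes |0\rangle \otimes |\alpha(0)\rangle \;&\longrightarrow\; (|\phi\rangle + |\psi\rangle) \otimes (|\phi\rangle + |\psi\rangle) \otimes |\alpha(\phi + \psi)\rangle \\
&= \left(|\phi\rangle|\phi\rangle + |\phi\rangle|\psi\rangle + |\psi\rangle|\phi\rangle + |\psi\rangle|\psi\rangle\right) \otimes |\alpha(\phi + \psi)\rangle
\end{aligned}$$


where, in the last line, we dropped the $\otimes$ between the first two Hilbert spaces. The
state that we get (5.15) is not the state that we want. This concludes our proof of the
no cloning theorem.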
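
The role of linearity can be seen in a concrete example. The sketch below uses a specific would-be copying machine, the CNOT gate, with $|\uparrow\rangle$ playing the role of the blank state. This choice is ours and is not taken from the argument above; it simply illustrates that a unitary which copies two orthogonal states is forced, by linearity, to fail on their superposition.

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])

# A would-be copying machine (an illustrative choice): the CNOT gate, which
# flips the second spin when the first spin is |down>. Basis order:
# |up,up>, |up,down>, |down,up>, |down,down>.
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)

def try_to_clone(psi):
    """Feed |psi>|blank> into the machine, with |blank> = |up>."""
    return CNOT @ np.kron(psi, up)

def is_clone(psi):
    """True if the output is the product state |psi>|psi>."""
    return np.allclose(try_to_clone(psi), np.kron(psi, psi))

plus = (up + down) / np.sqrt(2)

print(is_clone(up), is_clone(down), is_clone(plus))
```

The machine copies $|\uparrow\rangle$ and $|\downarrow\rangle$ perfectly, but acting on $(|\uparrow\rangle + |\downarrow\rangle)/\sqrt{2}$ it produces the entangled state $(|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle)/\sqrt{2}$ rather than two copies, just as in (5.15).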


Back to Teleportation 


With the no cloning theorem as background, we can now turn to the idea of quantum
teleportation. Alice is given a qubit in state $|\psi\rangle$. The challenge is to communicate this
state to Bob.


There are two limitations. First, Alice doesn’t get to simply put the qubit in the mail. 
That’s no longer the game. Instead, she must describe the qubit to Bob using classical 
information: i.e. bits, not qubits. Note that we’re now playing by different rules from 
the previous section. In “dense coding” we wanted to send classical information using 
qubits. Here we want to send quantum information using classical bits. 
Now this sounds like teleportation must be impossible. As we’ve seen, Alice has no
way of figuring out what state $|\psi\rangle$ she has. If she doesn’t know the state, how on earth
is she going to communicate it to Bob? Well, magically, there is a way. For this to work,
Alice and Bob must also share an EPR pair. We will see that they can sacrifice the
entanglement in this EPR pair to allow Bob to reproduce the state $|\psi\rangle$.


First, Alice. She has two qubits: the one we want to transfer, $|\psi\rangle$, together with
her half of the pair $|EPR\rangle$. She makes the following measurements:


$$\sigma_x \otimes \sigma_x \quad\text{and}\quad \sigma_z \otimes \sigma_z$$

where, in each case, the first operator acts on $|\psi\rangle$ and the second on her half of $|EPR\rangle$.


As we saw in the previous section, these are commuting operators, each with eigenvalues
±1. This means that there are four different outcomes to Alice’s experiment
and the state will be projected onto the eigenstates $|\phi^\pm\rangle$ or $|\chi^\pm\rangle$ defined in (5.13). The
different possible outcomes of the measurement were given in (5.14).


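Let’s see what becomes of the full state after Alice’s measurements. We write the
unknown qubit $|\psi\rangle$ as


$$|\psi\rangle = \alpha|\uparrow\rangle + \beta|\downarrow\rangle \tag{5.16}$$


with $|\alpha|^2 + |\beta|^2 = 1$. Then the full state of three qubits (two owned by Alice and one
by Bob) is


$$\begin{aligned}
|\psi\rangle \otimes |EPR\rangle &= \frac{1}{\sqrt{2}}\Big(\alpha|\uparrow\rangle|\uparrow\rangle|\downarrow\rangle - \alpha|\uparrow\rangle|\downarrow\rangle|\uparrow\rangle + \beta|\downarrow\rangle|\uparrow\rangle|\downarrow\rangle - \beta|\downarrow\rangle|\downarrow\rangle|\uparrow\rangle\Big) \\
&= \frac{1}{2}\Big(\alpha\,(|\phi^+\rangle + |\phi^-\rangle)|\downarrow\rangle - \alpha\,(|\chi^+\rangle + |\chi^-\rangle)|\uparrow\rangle \\
&\qquad\; + \beta\,(|\chi^+\rangle - |\chi^-\rangle)|\downarrow\rangle - \beta\,(|\phi^+\rangle - |\phi^-\rangle)|\uparrow\rangle\Big) \\
&= \frac{1}{2}\Big(|\phi^+\rangle(-\beta|\uparrow\rangle + \alpha|\downarrow\rangle) + |\phi^-\rangle(\beta|\uparrow\rangle + \alpha|\downarrow\rangle) \\
&\qquad\; + |\chi^+\rangle(-\alpha|\uparrow\rangle + \beta|\downarrow\rangle) - |\chi^-\rangle(\alpha|\uparrow\rangle + \beta|\downarrow\rangle)\Big)
\end{aligned}$$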

When Alice makes her measurement, the wavefunction collapses onto one of the four
eigenstates $|\phi^\pm\rangle$ or $|\chi^\pm\rangle$. But we see that Bob’s state, the final one in the wavefunction
above, has taken the form of a linear superposition of $|\uparrow\rangle$ and $|\downarrow\rangle$, with the same
coefficients $\alpha$ and $\beta$ that characterised the initial state $|\psi\rangle$ in (5.16). Now, in most of
these cases, Bob’s state isn’t exactly the same as $|\psi\rangle$, but that’s easily fixed if Bob acts
with a unitary operator. All Alice has to do is tell Bob which of the four states she
measured and this will be sufficient for Bob to know how he has to act. Let’s look at
each in turn.


• If Alice measures $|\phi^+\rangle$ then Bob should operate on his qubit with $\sigma_y$ to get

$$\sigma_y(-\beta|\uparrow\rangle + \alpha|\downarrow\rangle) = -i\beta|\downarrow\rangle - i\alpha|\uparrow\rangle = -i|\psi\rangle$$

which, up to a known phase, is Alice’s initial state.


• If Alice measures $|\phi^-\rangle$ then Bob should operate on his qubit with $\sigma_x$,

$$\sigma_x(\beta|\uparrow\rangle + \alpha|\downarrow\rangle) = \beta|\downarrow\rangle + \alpha|\uparrow\rangle = |\psi\rangle$$

• If Alice measures $|\chi^+\rangle$ then Bob should operate on his qubit with $\sigma_z$,

$$\sigma_z(-\alpha|\uparrow\rangle + \beta|\downarrow\rangle) = -\alpha|\uparrow\rangle - \beta|\downarrow\rangle = -|\psi\rangle$$


• If Alice measures $|\chi^-\rangle$, Bob can put his feet up and do nothing. He already has
$-|\psi\rangle$ sitting in front of him.


We see that if Alice sends Bob two bits of information, enough to specify which of
the four states she measured, then Bob can ensure that he gets state $|\psi\rangle$. Note that
this transfer occurred with neither Alice nor Bob knowing what the state $|\psi\rangle$ actually
is. But Bob can be sure that he has it.
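
The bookkeeping above can be verified numerically. The following is a minimal sketch, assuming standard Pauli matrices and the ordering (unknown qubit, Alice’s half of the pair, Bob’s half) for the three tensor factors. The amplitudes `alpha` and `beta` are arbitrary example values. For each of Alice’s four outcomes we project her two qubits onto the corresponding state in (5.13), apply Bob’s correction, and compute the overlap with the original state, which is insensitive to the overall phase.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)
s = 1 / np.sqrt(2)

# The four states of (5.13)
bell = {
    "phi+": s * (np.kron(up, up) + np.kron(down, down)),
    "phi-": s * (np.kron(up, up) - np.kron(down, down)),
    "chi+": s * (np.kron(up, down) + np.kron(down, up)),
    "chi-": s * (np.kron(up, down) - np.kron(down, up)),
}
# Bob's correction for each of Alice's outcomes, as listed above
correction = {"phi+": sy, "phi-": sx, "chi+": sz, "chi-": I2}

# An arbitrary normalised qubit; alpha and beta are example values only
alpha, beta = 0.6, 0.8j
psi = alpha * up + beta * down

# |psi> x |EPR>, with tensor factors ordered (unknown qubit, Alice, Bob)
full = np.kron(psi, bell["chi-"])

probabilities, fidelities = {}, {}
for name, b in bell.items():
    # Project Alice's two qubits onto <b|, leaving Bob's (unnormalised) qubit
    bob = b.conj() @ full.reshape(4, 2)
    probabilities[name] = np.vdot(bob, bob).real
    bob = correction[name] @ (bob / np.linalg.norm(bob))
    fidelities[name] = abs(np.vdot(psi, bob)) ** 2

print(probabilities)
print(fidelities)
```

Each outcome occurs with probability 1/4, independent of $\alpha$ and $\beta$, so Alice’s two classical bits reveal nothing about the state. In every case Bob ends up with $|\psi\rangle$ up to a phase.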


5.2.4 Quantum Key Distribution 


If you want to share a secret, it’s best to have a code. Here is an example of an 
unbreakable code. Alice and Bob want to send a message consisting of n classical bits, 
a string of 0’s and 1’s. To do so securely, they must share, in advance, a private key. 
This is a string of classical bits that is the same length as the message. Alice simply
adds the key to the message bitwise (0 + 0 = 1 + 1 = 0 and 0 + 1 = 1 + 0 = 1) before
sending it to Bob who, upon receiving it, subtracts the key to reveal the message. Any
third party eavesdropper — traditionally called Eve — who intercepts the transmission 
is none the wiser. 
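
In code, bitwise addition modulo 2 is the XOR operation, and subtracting the key is the same as adding it again. Here is a minimal sketch; the message is an arbitrary example and the helper name `xor_bits` is ours.

```python
import secrets

def xor_bits(bits, key):
    """Bitwise addition mod 2 of two equal-length lists of bits."""
    if len(bits) != len(key):
        raise ValueError("the key must be the same length as the message")
    return [b ^ k for b, k in zip(bits, key)]

message = [1, 0, 1, 1, 0, 0, 1, 0]              # example message
key = [secrets.randbits(1) for _ in message]    # shared in advance, used only once

ciphertext = xor_bits(message, key)     # what Alice transmits
recovered = xor_bits(ciphertext, key)   # Bob removes the key

print(recovered == message)
```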


The weakness of this approach is that, to be totally secure, Alice and Bob should
use a different key for each message that they want to send. If they fail to do this then
Eve can use some knowledge about the underlying message (e.g. it’s actually written in 
German and contains information about U-boat movements in the Atlantic) to detect 
correlations in the transmissions and, ultimately, crack the code. This means that Alice 
and Bob must have a large supply of private keys and be sure that Eve does not have 
access to them. This is where quantum mechanics can be useful. 
BB84


BB84 is a quantum protocol for generating a secure private key. It’s named after its 
inventors, Bennett and Brassard, who suggested this approach in 1984. 


The idea is remarkably simple. Alice takes a series of qubits. For each, she chooses
to measure the spin either in the z-direction, or in the x-direction. This leaves her with
a qubit in one of four possible states: $|\uparrow\rangle$, $|\downarrow\rangle$, $|\rightarrow\rangle$ or $|\leftarrow\rangle$. Alice then sends this
qubit to Bob. He has no idea which measurement Alice made, so he makes a random
decision to measure the spin in either the z-direction or the x-direction. About half
the time he will make the same measurement as Alice, the other half he will make a
different measurement.


Having performed these experiments, Alice and Bob then announce publicly which 
spin measurements they made. Whenever they measured the spin in different direc- 
tions, they simply discard their results. Whenever they measured the spin in the same 
direction, the measurements must agree. This becomes their private key. 


The whole purpose of generating a private key is that it must be private. For example,
the keys for the Enigma machine (as shown in the picture) were sent out monthly. If
you were lucky enough to capture this book, you could break the codes for the next
month. How can Alice and Bob be certain that their key hasn’t been intercepted by Eve?


[Figure 29: a key sheet for the Enigma machine, headed “Armee-Stabs-Maschinenschlüssel Nr. 28”]


This is where the laws of quantum physics come to the rescue. First, the no-cloning
theorem ensures that Eve has no way of copying the qubit if she
intercepts it. Nor does she have any way of determining its state. Even if she knows
the game that Alice and Bob are playing, the best that she can do is to measure the
spin in either the z-direction or the x-direction, before sending it on to Bob. Half the
time, she will make the same measurement as Alice and leave the state unchanged. But
the other half, she will change the state and so change the possible results that Bob
finds in his measurements. To guard against this possibility, Alice and Bob can simply
choose to publicly announce a subset of the results of their correlated measurements.
If they don’t perfectly agree, then they know that someone has tampered with the
transmission.
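
A small simulation makes the effect of Eve’s meddling visible. This is only a sketch under simplifying assumptions: the channel is noiseless, each spin is tracked classically as a pair (basis, bit), and a measurement in the “wrong” basis returns a fair coin flip and leaves the spin in the measured eigenstate. The function names and the seed are ours.

```python
import random

def measure(state_basis, state_bit, basis, rng):
    """Measure a spin prepared as (state_basis, state_bit) along `basis`.

    Same basis: the prepared bit is returned. Different basis: the outcome
    is a fair coin flip. Afterwards the spin sits in the measured eigenstate.
    """
    return state_bit if basis == state_basis else rng.randint(0, 1)

def bb84(n, eavesdrop, seed=0):
    """Return (length of the sifted key, fraction of sifted bits that disagree)."""
    rng = random.Random(seed)
    kept = errors = 0
    for _ in range(n):
        a_basis, a_bit = rng.choice("zx"), rng.randint(0, 1)   # Alice prepares
        basis, bit = a_basis, a_bit                            # spin in transit
        if eavesdrop:                                          # Eve intercepts and resends
            e_basis = rng.choice("zx")
            bit = measure(basis, bit, e_basis, rng)
            basis = e_basis
        b_basis = rng.choice("zx")                             # Bob measures
        b_bit = measure(basis, bit, b_basis, rng)
        if b_basis == a_basis:                                 # keep only matching bases
            kept += 1
            errors += (b_bit != a_bit)
    return kept, errors / kept

print(bb84(20000, eavesdrop=False))
print(bb84(20000, eavesdrop=True))
```

Without Eve, the retained bits agree perfectly. With Eve measuring every qubit, roughly a quarter of the retained bits disagree: she picks the wrong direction half the time, and in those cases Bob’s result is wrong half the time. Comparing even a modest subset of the key therefore exposes her.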
The BB84 protocol doesn’t make any use of quantum entanglement. There is, however,
a minor variation where entanglement plays a role. In this scenario, Alice prepares
a succession of entangled pairs in, say, the state

$$|\phi^+\rangle = \frac{1}{\sqrt{2}}\left(|\uparrow\rangle|\uparrow\rangle + |\downarrow\rangle|\downarrow\rangle\right)$$

She then sends the second spin to Bob. When the two of them both have their spins,
they can follow the BB84 rules to generate the key. The slight advantage of this
approach is that Alice doesn’t have to record her measurements before sending them
to Bob. This protects her from the possibility that someone breaks into her lab and
takes sneaky photographs of her measurement results. Of course, one might wonder if
the extra resources involved in generating coherent entangled states might not be put
to better use in, for example, buying a decent safe.


The moral behind quantum key distribution is clear: quantum information is more 
secure than classical information because no one, whether friend or enemy, can be sure 
what quantum state they’ve been given. 


5.3 Density Matrices 


In Section 5.1, we’ve made a big deal out of the fact that quantum correlations cannot
be captured by classical probability distributions. In the classical world, uncertainty 
is due to ignorance: the more you know, the better your predictions. In the quantum 
world, the uncertainty is inherent and can’t be eliminated by gaining more knowledge. 


There are situations in the quantum world where we have to deal with both kinds 
of uncertainties. There are at least two contexts in which this arises. One possibility 
is ignorance: we simply don’t know for sure what quantum state our system lies in. 
Another possibility is that we have many quantum states — an ensemble — and they 
don’t all lie in the same state, but rather in a mixture of different states. In either 
context, we use the same mathematical formalism. 


Suppose that we don’t know which of the states $|\psi_i\rangle$ describes our system. These
states need not be orthogonal, just different. To parameterise our ignorance, we assign
classical probabilities $p_i$ to each of these states. The expectation value of any operator
$A$ is given by


$$\langle A \rangle = \sum_i p_i \langle\psi_i|A|\psi_i\rangle \tag{5.17}$$

This expression includes both classical uncertainty (in the $p_i$) and quantum uncertainty
(in the $\langle\psi_i|A|\psi_i\rangle$).
Such a state is described by an operator known as the density matrix,

$$\rho = \sum_i p_i |\psi_i\rangle\langle\psi_i| \tag{5.18}$$


Clearly, this is a sum of projections onto the spaces spanned by $|\psi_i\rangle$, weighted with
the probabilities $p_i$. The expectation value (5.17) of any operator can now be written
simply as


$$\langle A \rangle = \mathrm{Tr}(\rho A)$$

where the trace is over all states in the Hilbert space.
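
As a quick sanity check that (5.17) and the trace formula agree, here is a small numerical sketch. The ensemble (two non-orthogonal spin states with weights 1/4 and 3/4) and the random Hermitian observable are arbitrary choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

up = np.array([1, 0], dtype=complex)
right = np.array([1, 1], dtype=complex) / np.sqrt(2)   # spin up along the x-axis

# Example ensemble: non-orthogonal states with classical probabilities
probs = [0.25, 0.75]
states = [up, right]

# The density matrix (5.18)
rho = sum(p * np.outer(psi, psi.conj()) for p, psi in zip(probs, states))

# A random Hermitian observable, used only for the check
M = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
A = M + M.conj().T

weighted_sum = sum(p * np.vdot(psi, A @ psi).real for p, psi in zip(probs, states))
trace_formula = np.trace(rho @ A).real

print(np.isclose(weighted_sum, trace_formula))
print(rho.real)
```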


Pure States vs Mixed States 


Previously, we thought that the state of a quantum system is described by a normalised
vector in the Hilbert space. The density matrix is a generalisation of this idea to
incorporate classical probabilities. If we’re back in the previous situation, where we
know for sure that the system is described by a specific state $|\psi\rangle$, then the density
matrix is simply the projection operator


$$\rho = |\psi\rangle\langle\psi|$$


In this case, we say that we have a pure state. If the density matrix cannot be written
in this form then we say that we have a mixed state. Note that a pure state has the
property that

$$\rho^2 = \rho$$


Regardless of whether a state is pure or mixed, the density matrix encodes all our
information about the state and allows us to compute the expected outcome of any
measurement. Note that the density matrix does not contain information about the
overall phases of the states $|\psi_i\rangle$ since these have no bearing on any physical measurement.


Properties of the Density Matrix 
The density matrix (5.18) has the following properties


• It is self-adjoint: $\rho = \rho^\dagger$.


• It has unit trace: $\mathrm{Tr}\,\rho = 1$. This property is equivalent to the normalisation of a
probability distribution, so that $\sum_i p_i = 1$.

• It is positive: $\langle\phi|\rho|\phi\rangle \geq 0$ for all $|\phi\rangle \in \mathcal{H}$. This property, which strictly speaking
should be called “non-negative”, is equivalent to the requirement that $p_i \geq 0$. As
shorthand, we sometimes write the positivity requirement simply as $\rho \geq 0$.


Furthermore, any operator $\rho$ which satisfies these three properties can be viewed as a
density matrix for a quantum system. To see this, we can look at the eigenvectors of
$\rho$, given by


$$\rho|\phi_n\rangle = p_n|\phi_n\rangle$$


where, here, $p_n$ is simply the corresponding eigenvalue. Because $\rho = \rho^\dagger$, we know that
$p_n \in \mathbb{R}$. The second two properties above then tell us that $\sum_n p_n = 1$ and $p_n \geq 0$.
This is all we need to interpret $p_n$ as a probability distribution. We can then write $\rho$
as


$$\rho = \sum_n p_n |\phi_n\rangle\langle\phi_n| \tag{5.19}$$


This way of writing the density matrix is a special case of (5.18). It’s special because
the $|\phi_n\rangle$ are eigenvectors of a Hermitian matrix and, hence, orthogonal. In contrast, the
vectors $|\psi_i\rangle$ in (5.18) are not necessarily orthonormal. However, although the expression
(5.19) is special, there’s nothing special about $\rho$ itself: any density matrix can be written
in this form. We’ll come back to this idea below when we discuss specific examples.


An Example: Statistical Mechanics 
There are many places in physics where it pays to think of probability distributions
over ensembles of states. One prominent example is what happens for systems at finite
temperature T. This is the subject of Statistical Mechanics.


Recall that the Boltzmann distribution tells us that the probability $p_n$ that we sit in
an energy eigenstate $|n\rangle$ is given by

$$p_n = \frac{e^{-\beta E_n}}{Z} \quad\text{where}\quad \beta = \frac{1}{k_B T} \quad\text{and}\quad Z = \sum_n e^{-\beta E_n}$$


where $k_B$ is the Boltzmann constant. It is straightforward to construct a density
matrix corresponding to this ensemble. It is given by


$$\rho = \frac{e^{-\beta H}}{Z} \tag{5.20}$$

where $H$ is the Hamiltonian. Similarly, the partition function is given by

$$Z = \mathrm{Tr}\, e^{-\beta H}$$


It is then straightforward to reformulate much of statistical mechanics in this language.
For example, the average energy of a system is $\langle E \rangle = \mathrm{Tr}(\rho H)$.
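
Here is a minimal numerical sketch of the thermal density matrix for a two-level system. The Hamiltonian and the value of $\beta$ are hypothetical, in arbitrary units. The check is that $\mathrm{Tr}(\rho H)$ reproduces the Boltzmann-weighted average of the energy eigenvalues.

```python
import numpy as np

def thermal_state(H, beta):
    """Return (rho, Z) with rho = exp(-beta H)/Z, built in the eigenbasis of H."""
    energies, vecs = np.linalg.eigh(H)
    weights = np.exp(-beta * energies)
    Z = weights.sum()
    # Sum over n of p_n |n><n|, with the eigenvectors stored as columns of vecs
    rho = (vecs * (weights / Z)) @ vecs.conj().T
    return rho, Z

# Hypothetical two-level Hamiltonian and inverse temperature (arbitrary units)
H = np.array([[0.5, 0.2], [0.2, -0.5]])
beta = 2.0

rho, Z = thermal_state(H, beta)

energies = np.linalg.eigvalsh(H)
boltzmann = np.exp(-beta * energies) / Z      # the probabilities p_n

average_energy = np.trace(rho @ H).real
print(np.isclose(average_energy, boltzmann @ energies))
```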
In these lectures, we won’t necessarily be interested in the kind of macroscopic systems
that arise in statistical physics. Instead, we’ll build some rather different intuition
for the meaning of the density matrix.


Time Evolution 


Recall that in the Schrödinger picture, any state evolves as

$$|\psi(t)\rangle = U(t)|\psi(0)\rangle \quad\text{with}\quad U(t) = e^{-iHt/\hbar}$$

From this we learn that the density matrix evolves as

$$\rho(t) = U(t)\rho(0)U^\dagger(t)$$

Differentiating with respect to t gives us a differential equation governing time evolution,


$$\frac{\partial \rho}{\partial t} = -\frac{i}{\hbar}[H, \rho] \tag{5.21}$$


This is the Liouville equation. Or, more accurately, it is the quantum version of the
Liouville equation which we met in the Classical Dynamics lectures where it governs
the evolution of probability distributions on phase space.


Note that any density operator which depends only on the Hamiltonian H is inde- 
pendent of time. The Boltzmann distribution (5.20) is the prime example. 
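
Both statements can be checked numerically. The sketch below, in units with $\hbar = 1$ and with a hypothetical Hamiltonian and initial state, compares a finite-difference derivative of $\rho(t) = U\rho(0)U^\dagger$ with the right-hand side of (5.21), and then confirms that a Boltzmann density matrix built from the same Hamiltonian does not evolve.

```python
import numpy as np

hbar = 1.0   # units with hbar = 1 (an assumption of this sketch)

# Hypothetical Hamiltonian and initial state, chosen only for illustration
H = np.array([[1.0, 0.3], [0.3, -1.0]], dtype=complex)
rho0 = np.array([[0.75, 0.25], [0.25, 0.25]], dtype=complex)

energies, vecs = np.linalg.eigh(H)

def evolve(rho, t):
    """rho(t) = U rho U^dagger with U = exp(-i H t / hbar)."""
    U = (vecs * np.exp(-1j * energies * t / hbar)) @ vecs.conj().T
    return U @ rho @ U.conj().T

# Central finite difference of rho(t) against -(i/hbar)[H, rho(t)]
t, dt = 0.7, 1e-5
lhs = (evolve(rho0, t + dt) - evolve(rho0, t - dt)) / (2 * dt)
rho_t = evolve(rho0, t)
rhs = -1j / hbar * (H @ rho_t - rho_t @ H)
liouville_ok = np.allclose(lhs, rhs, atol=1e-6)

# A density matrix that depends only on H: Boltzmann weights with beta = 1
weights = np.exp(-energies)
rho_thermal = (vecs * (weights / weights.sum())) @ vecs.conj().T
stationary = np.allclose(evolve(rho_thermal, t), rho_thermal)

print(liouville_ok, stationary)
```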


5.3.1 The Bloch Sphere 


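As an example, let’s return to our favourite two-state system. If we measure spin along
the z-axis, then the two eigenstates are $|\uparrow\rangle$ and $|\downarrow\rangle$.


Suppose that we know for sure that we’re in state $|\uparrow\rangle$. Then, obviously,


$$\rho = |\uparrow\rangle\langle\uparrow|$$


If however, there’s probability $p = \frac{1}{2}$ that we’re in state $|\uparrow\rangle$ and, correspondingly,
probability $1 - p = \frac{1}{2}$ that we’re in state $|\downarrow\rangle$, then


$$\rho = \frac{1}{2}|\uparrow\rangle\langle\uparrow| + \frac{1}{2}|\downarrow\rangle\langle\downarrow| = \frac{1}{2}\mathbb{1} \tag{5.22}$$


This is the state of maximum ignorance, something we will quantify below in Section
5.3.3. In particular, the average value for the spin along any axis always vanishes:
$\langle\boldsymbol{\sigma}\rangle = \mathrm{Tr}(\rho\,\boldsymbol{\sigma}) = 0$.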
Let’s now consider other spin states. Consider the spin measured along the x-axis.
Suppose that there’s probability $p = \frac{1}{2}$ that we’re in state $|\rightarrow\rangle$ and probability $1 - p = \frac{1}{2}$
that we’re in state $|\leftarrow\rangle$, then


$$\rho = \frac{1}{2}|\rightarrow\rangle\langle\rightarrow| + \frac{1}{2}|\leftarrow\rangle\langle\leftarrow| = \frac{1}{2}\mathbb{1} \tag{5.23}$$


Once again, we find a state of maximum ignorance. This highlights an important fact:
given a density matrix $\rho$, there is no unique way to decompose it in the form (5.18).


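As a final example, there is nothing to stop us taking an ensemble of non-orthogonal
states. So we could be in state $|\uparrow\rangle$ with probability $p = \frac{1}{2}$ and in state $|\rightarrow\rangle$ with
probability $p = \frac{1}{2}$. The resulting density matrix is


$$\begin{aligned}
\rho &= \frac{1}{2}|\uparrow\rangle\langle\uparrow| + \frac{1}{2}|\rightarrow\rangle\langle\rightarrow| \\
&= \frac{1}{2}|\uparrow\rangle\langle\uparrow| + \frac{1}{4}\left(|\uparrow\rangle + |\downarrow\rangle\right)\left(\langle\uparrow| + \langle\downarrow|\right) \\
&= \frac{3}{4}|\uparrow\rangle\langle\uparrow| + \frac{1}{4}|\uparrow\rangle\langle\downarrow| + \frac{1}{4}|\downarrow\rangle\langle\uparrow| + \frac{1}{4}|\downarrow\rangle\langle\downarrow|
\end{aligned}$$


We haven’t written this density matrix in the form (5.19), although it’s not difficult to
do so. Nonetheless, it’s simple to check that it obeys the three conditions above. We
find $\langle\sigma_x\rangle = \langle\sigma_z\rangle = 1/2$ and $\langle\sigma_y\rangle = 0$.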


Let’s now look at the most general density matrix for a two-state system. The most
general Hermitian $2\times 2$ matrix can be expanded in terms of $\mathbb{1}$ and the Pauli matrices
$\sigma^i$. Since $\mathrm{Tr}\,\mathbb{1} = 2$ and $\mathrm{Tr}\,\sigma^i = 0$, the requirement that $\mathrm{Tr}\,\rho = 1$ means that we can write


$$\rho = \tfrac{1}{2}\left(\mathbb{1} + \mathbf{a}\cdot\boldsymbol{\sigma}\right) \qquad (5.24)$$


for some 3-vector $\mathbf{a}$. All that’s left is to require that this matrix has non-negative eigenvalues.
The sum of the two eigenvalues is given by $\mathrm{Tr}\,\rho = 1$, so at least one of them must be
positive. The product of the eigenvalues is given by $\det\rho$. It’s simple to compute


$$\det\rho = \tfrac{1}{4}\left(1 - \mathbf{a}\cdot\mathbf{a}\right)$$


The two eigenvalues are both non-negative if $\det\rho \geq 0$. We learn that (5.24) defines a
density matrix for a two-state system if


$$|\mathbf{a}| \leq 1$$


This is a solid ball in three dimensions, bounded by the unit 2-sphere, which should be
called the Bloch Ball. Unfortunately the names are a little mixed-up and the ball is
sometimes referred to as the Bloch Sphere. The interior of the ball, with $|\mathbf{a}| < 1$,
describes mixed states. The surface of the ball with $|\mathbf{a}| = 1$, which should really be
called the Bloch Sphere, describes pure states.


For both mixed and pure states, the direction $\mathbf{a}$ is referred to as the polarisation of
the spin. For $\mathbf{a} \neq 0$, there will be a preference for the measurements of spin in the
direction $\mathbf{a}\cdot\boldsymbol{\sigma}$. In contrast, when $\mathbf{a} = 0$, the state is said to be unpolarised. We met
two examples of this above.
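The parametrisation (5.24) is easy to check numerically. The following is a minimal sketch, assuming NumPy; the helper names `bloch_density_matrix` and `polarisation` are ours, chosen only for illustration. It builds $\rho$ from a vector $\mathbf{a}$, confirms that the eigenvalues are $\tfrac{1}{2}(1 \pm |\mathbf{a}|)$, and recovers the polarisation from $a_i = \mathrm{Tr}(\rho\,\sigma^i)$. The choice $\mathbf{a} = (\tfrac{1}{2}, 0, \tfrac{1}{2})$ corresponds to the equal-weight ensemble of $|\uparrow\rangle$ and $|\rightarrow\rangle$ considered earlier.

```python
import numpy as np

# Pauli matrices sigma^1, sigma^2, sigma^3
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def bloch_density_matrix(a):
    """Return rho = (1 + a.sigma)/2 for a real 3-vector a."""
    a = np.asarray(a, dtype=float)
    return 0.5 * (np.eye(2) + np.tensordot(a, SIGMA, axes=1))

def polarisation(rho):
    """Recover the components a_i = Tr(rho sigma^i)."""
    return np.real([np.trace(rho @ s) for s in SIGMA])

a = np.array([0.5, 0.0, 0.5])             # a point strictly inside the Bloch ball
rho = bloch_density_matrix(a)
eigenvalues = np.linalg.eigvalsh(rho)     # ascending order: (1 - |a|)/2, (1 + |a|)/2
determinant = np.linalg.det(rho).real     # should equal (1 - a.a)/4
```

Since $|\mathbf{a}| < 1$ here, both eigenvalues are strictly positive and the state is mixed.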


The Ambiguity of Preparation 


There are typically many different interpretations of a density matrix. We’ve seen an
example above, where two different probability distributions over states (5.22) and
(5.23) both give rise to the same density matrix. It’s sometimes said that these density
matrices are prepared differently, but describe the same state.


More generally, suppose that the system is described by density matrix $\rho_1$ with some
probability $\lambda$ and density matrix $\rho_2$ with some probability $(1 - \lambda)$. The expectation
value of any operator is determined by the density matrix


$$\rho(\lambda) = \lambda\rho_1 + (1 - \lambda)\rho_2$$


Indeed, nearly all density operators can be expressed as the sum of other density 
operators in an infinite number of different ways. 


There is an exception to this. If the density matrix p actually describes a pure state 
then it cannot be expressed as the sum of two other states. 


5.3.2 Entanglement Revisited 


The density matrix has a close connection to the ideas of entanglement that we met in
earlier sections. Suppose that our Hilbert space decomposes into a tensor product of two factors,


$$\mathcal{H} = \mathcal{H}_A \otimes \mathcal{H}_B$$


This is sometimes referred to as a bipartite decomposition of the Hilbert space. It really
means that $\mathcal{H}_A$ and $\mathcal{H}_B$ describe two different physical systems. In what follows, it will
be useful to think of these systems as far separated, so that they don’t interact with
each other. Nonetheless, as we’ve seen in Section 5.1, quantum states can be entangled
between these two systems, giving rise to correlations between measurements.

Let’s consider things from Alice’s perspective. She only has access to the system
described by $\mathcal{H}_A$. This means that she gets to perform measurements associated to
operators of the form


$$O = A \otimes \mathbb{1}$$


If the state of the full system is described by the density matrix $\rho_{AB}$, then measurements
Alice makes will have expectation value


$$\langle A\rangle = \mathrm{Tr}_{\mathcal{H}_A}\mathrm{Tr}_{\mathcal{H}_B}\big((A\otimes\mathbb{1})\,\rho_{AB}\big) = \mathrm{Tr}_{\mathcal{H}_A}(A\,\rho_A)$$

where we’ve defined


$$\rho_A = \mathrm{Tr}_{\mathcal{H}_B}\,\rho_{AB}$$


This is called the reduced density matrix. It is related to the full density matrix by taking
the partial trace over the Hilbert space $\mathcal{H}_B$. We see that, from Alice’s perspective, the
part of the system that she has access to is described by the density matrix $\rho_A$.


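Suppose that the full system $\rho_{AB}$ lies in a pure state. This means that it takes the
form


$$|\Psi\rangle = \sum_{i,j}\alpha_{ij}\,|\phi_i\rangle\otimes|\tilde\phi_j\rangle \qquad (5.25)$$


where we’ve introduced a basis $|\phi_i\rangle$ for $\mathcal{H}_A$ and $|\tilde\phi_j\rangle$ for $\mathcal{H}_B$. (These two Hilbert spaces
need not have the same dimension.) Note that, in general, this is an example of an
entangled state.


The density matrix for the full system is

$$\rho_{AB} = |\Psi\rangle\langle\Psi| = \sum_{i,j,k,l}\alpha_{ij}\alpha^*_{kl}\,|\phi_i\rangle\otimes|\tilde\phi_j\rangle\,\langle\phi_k|\otimes\langle\tilde\phi_l|$$

Taking the partial trace then gives the reduced density matrix

$$\rho_A = \sum_{i,k}\beta_{ik}\,|\phi_i\rangle\langle\phi_k| \quad\text{with}\quad \beta_{ik} = \sum_j\alpha_{ij}\alpha^*_{kj}$$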
But this is, in general, the density matrix for a mixed state. This means that even if the
full system is in a pure state, as far as Alice is concerned it effectively lies in a mixed
state. This illustrates how the probabilities $p_i$ can arise from our lack of knowledge of
other parts of the system. However, the presence of entanglement in the original state
means that even ignorance about physics in far flung places forces us to deal with a mixed state.

In fact, this approach allows us to define entanglement between two subsystems, something
that we avoided doing in the previous sections. The state $|\Psi\rangle$ is said to be
entangled if the reduced density matrix $\rho_A = \mathrm{Tr}_{\mathcal{H}_B}|\Psi\rangle\langle\Psi|$ describes a mixed state.
Otherwise $|\Psi\rangle$ is said to be separable. We will quantify the amount of entanglement in
a system in Section 5.3.3 using the concept of entropy.


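EPR Pairs Revisited


Let’s return to our favourite example of entanglement between two qubits. The EPR
state is

$$|EPR\rangle = \frac{1}{\sqrt{2}}\big(|\uparrow\downarrow\rangle - |\downarrow\uparrow\rangle\big)$$

where, in an attempt to stop us going boggle-eyed in later equations, we’re using
notation such that $|\uparrow\rangle|\downarrow\rangle = |\uparrow\downarrow\rangle$. The associated density matrix is


$$\rho_{EPR} = \tfrac{1}{2}\big(|\uparrow\downarrow\rangle\langle\uparrow\downarrow| + |\downarrow\uparrow\rangle\langle\downarrow\uparrow| - |\uparrow\downarrow\rangle\langle\downarrow\uparrow| - |\downarrow\uparrow\rangle\langle\uparrow\downarrow|\big) \qquad (5.26)$$

We now take the trace over Bob’s spin to get the reduced density matrix for Alice,

$$\rho_A = \mathrm{Tr}_{\mathcal{H}_B}\,\rho_{EPR} = \tfrac{1}{2}\big(|\uparrow\rangle\langle\uparrow| + |\downarrow\rangle\langle\downarrow|\big) = \tfrac{1}{2}\mathbb{1} \qquad (5.27)$$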


Everything that Alice can measure on her own is captured in p4, which is the state of 
maximum ignorance. We see that although the total density matrix knows about the 
correlations, there’s no way that Alice can know about this on her own. 


To illustrate this, suppose that Bob performs a measurement on his spin. This
projects the EPR pair into state $|\uparrow\downarrow\rangle$ with probability $p = \tfrac{1}{2}$ and into state $|\downarrow\uparrow\rangle$ with
probability $p = \tfrac{1}{2}$. Bob, of course, knows which of these states the system has collapsed
to. However, if we don’t know the outcome of this measurement then we should describe
the system in terms of the mixed state


$$\rho_{\rm mixed} = \tfrac{1}{2}|\uparrow\downarrow\rangle\langle\uparrow\downarrow| + \tfrac{1}{2}|\downarrow\uparrow\rangle\langle\downarrow\uparrow|$$


This differs from the EPR density matrix (5.26). However, if we take the trace over
Bob’s degrees of freedom then we find that Alice’s reduced density matrix $\rho_A$ is once
again given by (5.27). This is the statement that nothing changes for Alice when
Bob performs a measurement. We can also repeat this exercise when Bob performs
a measurement in a different spin direction. Once again, we find that $\rho_A$ is given by
(5.27). All of this is telling us something that we already knew: we cannot use the
non-local correlations inherent in quantum states to transmit information in a non-local
fashion.
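The two partial traces above can be verified in a few lines. This is a minimal sketch, assuming NumPy and the convention that `np.kron` orders the basis as (Alice, Bob); the function name `partial_trace_B` is ours. It shows that the EPR state and the mixed state left after Bob’s unread measurement are different density matrices on the full system, yet give Alice the same reduced density matrix.

```python
import numpy as np

up = np.array([1, 0], dtype=complex)
down = np.array([0, 1], dtype=complex)

def partial_trace_B(rho_AB, dim_A, dim_B):
    """Trace out the second factor of H_A (x) H_B."""
    rho = rho_AB.reshape(dim_A, dim_B, dim_A, dim_B)
    return np.einsum('ibjb->ij', rho)

# EPR state (|up,down> - |down,up>)/sqrt(2) and its density matrix (5.26)
epr = (np.kron(up, down) - np.kron(down, up)) / np.sqrt(2)
rho_epr = np.outer(epr, epr.conj())

# State after Bob measures along z, when we do not learn the outcome
ud = np.kron(up, down)
du = np.kron(down, up)
rho_mixed = 0.5 * np.outer(ud, ud.conj()) + 0.5 * np.outer(du, du.conj())

rho_A_epr = partial_trace_B(rho_epr, 2, 2)
rho_A_mixed = partial_trace_B(rho_mixed, 2, 2)
```

Both `rho_A_epr` and `rho_A_mixed` come out as half the identity, in agreement with (5.27).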
Schmidt Decomposition


Consider a pure state $|\Psi\rangle$ in $\mathcal{H} = \mathcal{H}_A\otimes\mathcal{H}_B$. Given bases $|\phi_i\rangle$ and $|\tilde\phi_j\rangle$, we
can always decompose the state in the form (5.25). Moreover, it turns out that there
is a preferred choice of basis states. The resulting expression is known as the Schmidt
decomposition.


First, let’s define a canonical basis for $\mathcal{H}_A$. As we’ve seen above, we can take the
partial trace over $\mathcal{H}_B$ to derive the reduced density matrix $\rho_A$. We’ll choose $|\phi_i\rangle$
to be the eigenvectors of $\rho_A$, as in (5.19). We can then write


$$\rho_A = \sum_i p_i\,|\phi_i\rangle\langle\phi_i| \qquad (5.28)$$


Our next task is to construct a suitable basis for $\mathcal{H}_B$. We could, of course, choose
the eigenbasis of $\rho_B$ and, in fact, ultimately this is what we’ll end up doing. But in order
to illustrate a rather nice property of this decomposition, we’ll get there in a slightly
roundabout way. Given a decomposition of the form (5.25), we define the vectors


$$|\chi_i\rangle = \sum_j\alpha_{ij}\,|\tilde\phi_j\rangle \in \mathcal{H}_B$$


Note that nothing guarantees that the vectors $|\chi_i\rangle$ are normalised, and nothing guarantees
that they are orthogonal. For now, their only purpose is to allow us to write the
state (5.25) as


$$|\Psi\rangle = \sum_i |\phi_i\rangle\otimes|\chi_i\rangle$$

Now let’s compute $\rho_A$ from this state. We have

$$\rho_A = \sum_{i,j}\mathrm{Tr}_{\mathcal{H}_B}\,|\phi_i\rangle\otimes|\chi_i\rangle\,\langle\phi_j|\otimes\langle\chi_j| = \sum_{i,j}\langle\chi_j|\chi_i\rangle\,|\phi_i\rangle\langle\phi_j|$$

But we know that this reduced density matrix takes the form (5.28). This means that
the overlap of the $|\chi_i\rangle$ vectors must be

$$\langle\chi_j|\chi_i\rangle = p_i\,\delta_{ij}$$
We learn that these vectors aren’t normalised but, perhaps surprisingly, they are orthogonal.
It’s then straightforward to define a basis of vectors by


$$|\tilde\chi_i\rangle = \frac{1}{\sqrt{p_i}}\,|\chi_i\rangle$$

Only those vectors with $p_i \neq 0$ actually appear so we don’t have to worry about
dividing by zero here. The upshot of this is that we can write any pure bipartite state
in the canonical decomposition


$$|\Psi\rangle = \sum_i\sqrt{p_i}\,|\phi_i\rangle\otimes|\tilde\chi_i\rangle \qquad (5.29)$$


This is the Schmidt decomposition. Note that there is a nice symmetry between the
reduced density matrices $\rho_A$ and $\rho_B$. They are, respectively,


$$\rho_A = \sum_i p_i\,|\phi_i\rangle\langle\phi_i| \quad,\quad \rho_B = \sum_i p_i\,|\tilde\chi_i\rangle\langle\tilde\chi_i|$$


We see that the basis $|\tilde\chi_i\rangle$ are the eigenvectors of $\rho_B$, even though this wasn’t how we
initially constructed them. Further, the non-zero probabilities $p_i$ are eigenvalues of both
$\rho_A$ and $\rho_B$. In particular if, say, $\dim\mathcal{H}_B > \dim\mathcal{H}_A$ then there must be some states in $\mathcal{H}_B$ that
do not appear in the Schmidt decomposition (5.29).


If the probabilities $p_i$ are distinct then the Schmidt decomposition is unique. In
contrast, if $\rho_A$ has degenerate eigenvalues then there is some ambiguity in the Schmidt
decomposition, as we get to decide which of the degenerate eigenvectors in $\mathcal{H}_A$ pairs
with their counterpart in $\mathcal{H}_B$.


The Schmidt rank $R$ is the number of non-zero eigenvalues $p_i$ in the decomposition
(5.29). If $R = 1$ then the state takes the form


$$|\Psi\rangle = |\phi\rangle\otimes|\chi\rangle$$

and is separable. If $R > 1$, the state is entangled.
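In practice, the Schmidt decomposition is just the singular value decomposition of the coefficient matrix $\alpha_{ij}$ in (5.25): the singular values are $\sqrt{p_i}$ and the two unitary factors supply the bases. The sketch below assumes NumPy and uses a helper name, `schmidt_decomposition`, that is ours rather than anything standard. It computes the Schmidt rank of the EPR state and of a product state.

```python
import numpy as np

def schmidt_decomposition(alpha, tol=1e-12):
    """Return (p, basis_A, basis_B) such that
    alpha = sum_i sqrt(p_i) * outer(basis_A[:, i], basis_B[:, i])."""
    u, s, vh = np.linalg.svd(alpha, full_matrices=False)
    keep = s > tol                      # discard vanishing Schmidt coefficients
    return s[keep] ** 2, u[:, keep], vh[keep, :].T

# EPR state: coefficients alpha_ij in the (up, down) basis of each spin
alpha_epr = np.array([[0.0, 1.0], [-1.0, 0.0]]) / np.sqrt(2)
p, basis_A, basis_B = schmidt_decomposition(alpha_epr)
schmidt_rank = len(p)                   # 2, so the state is entangled

# Product state |up> (x) |right>: a single non-zero Schmidt coefficient
alpha_product = np.outer([1.0, 0.0], [1.0, 1.0]) / np.sqrt(2)
p_product, _, _ = schmidt_decomposition(alpha_product)
```

The EPR state has $p_1 = p_2 = \tfrac{1}{2}$, so its decomposition is degenerate and the bases returned are just one of many equally valid choices.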


Finally, let’s go back to Alice and Bob. Each gets to act on their subsystem by
transforming the state they have to any other. This means that, between them, they
get to act with unitary operators on $\mathcal{H} = \mathcal{H}_A\otimes\mathcal{H}_B$ of the form


$$U = U_A\otimes U_B$$


However, the state $|\Psi\rangle$ and the state $U|\Psi\rangle$ have the same Schmidt rank. This is
important. It tells us that we cannot change the amount of entanglement by local
operators which act only on part of the Hilbert space. To create entanglement, we
need to act with operators which rotate $\mathcal{H}_A$ into $\mathcal{H}_B$. In other words, there has to be
some interaction between the two subsystems. Entanglement can only be
created by bringing the two subsystems together.


Purification 


There is a simple corollary to our discussion above. For any density matrix $\rho$ describing
a state in a Hilbert space $\mathcal{H}_A$, one can always find a pure state $|\Psi\rangle$ in a larger
Hilbert space $\mathcal{H} = \mathcal{H}_A\otimes\mathcal{H}_B$ such that $\rho = \mathrm{Tr}_{\mathcal{H}_B}|\Psi\rangle\langle\Psi|$. This process is referred to as
purification of the state.


Everything that we need to show this is in our derivation above. We write the density
matrix in the orthonormal basis (5.28). We then introduce the enlarged Hilbert space
$\mathcal{H}_B$ whose dimension is the same as the number of non-zero $p_i$ in (5.28). The Schmidt
decomposition (5.29) then provides an example of a purification of $\rho$.


5.3.3 Entropy 
Given a classical probability distribution $\{p_i\}$, the entropy is defined by


$$S = -\sum_i p_i\log p_i \qquad (5.30)$$


where log is the natural logarithm. In information theory, this is called the Shannon
entropy. In physics, this quantity is usually multiplied by the Boltzmann constant $k_B$
and is called the Gibbs entropy. It plays an important role in the lectures on Statistical
Physics.


The entropy is a measure of the uncertainty encoded in the probability distribution.
For example, if there’s no uncertainty because, say, $p_1 = 1$ while all other $p_i = 0$, then
we have $S = 0$. In contrast, if there are $N$ possibilities the entropy is maximised when
we have no idea which is most likely, meaning that $p_i = 1/N$ for each. In this case
$S = \log N$.


For a quantum state described by a density matrix $\rho$, we define the entropy to be

$$S(\rho) = -\mathrm{Tr}\,\rho\log\rho \qquad (5.31)$$


This is the von Neumann entropy (because entropy really needs more names attached 
to it). If we’re dealing with a reduced density matrix, that came from taking a partial 
trace of a pure state of a larger system, then S is referred to as the entanglement 
entropy. In all cases, we’re simply going to call it the entropy. 


When the density matrix is expanded in an orthonormal basis,

$$\rho = \sum_i p_i\,|\phi_i\rangle\langle\phi_i|$$

then the definition (5.31) coincides with the earlier definition (5.30).

A pure state has $p_i = 1$ for some $|\phi_i\rangle$, and so has vanishing entropy. But $S \neq 0$ for
any mixed state.
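Because (5.31) reduces to (5.30) in the eigenbasis, the entropy can be computed directly from the eigenvalues of $\rho$. The following is a minimal sketch, assuming NumPy; the function name `von_neumann_entropy` and the cutoff `tol` (which implements the convention $0\log 0 = 0$) are our own choices.

```python
import numpy as np

def von_neumann_entropy(rho, tol=1e-12):
    """S(rho) = -Tr(rho log rho), evaluated from the eigenvalues of rho."""
    p = np.linalg.eigvalsh(rho)
    p = p[p > tol]                      # drop zero eigenvalues: 0 log 0 = 0
    return float(-np.sum(p * np.log(p)))

pure = np.array([[1, 0], [0, 0]], dtype=complex)    # |up><up|
maximally_mixed = np.eye(2) / 2                     # the state (5.22)

S_pure = von_neumann_entropy(pure)                  # vanishes for a pure state
S_mixed = von_neumann_entropy(maximally_mixed)      # log 2 for a qubit
```

For the reduced density matrix (5.27) of the EPR pair this gives $S = \log 2$, the largest value available to a single qubit.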


The entropy has a number of properties, some of which are easier to prove than 
others. First the properties that are straightforward to show: 


• Positivity: $S(\rho) \geq 0$.
• Minimum: $S(\rho) = 0$ if and only if $\rho$ is a pure state.


• Maximum: If the probabilities are non-vanishing on an $N$ dimensional Hilbert
space $\mathcal{H}_N$, then the entropy takes its maximum value $S = \log N$ when $\rho = \tfrac{1}{N}\mathbb{1}$
on $\mathcal{H}_N$.


• Concavity: If $\sum_i\lambda_i = 1$ with each $\lambda_i \geq 0$, then

$$S\Big(\sum_i\lambda_i\rho_i\Big) \geq \sum_i\lambda_i\,S(\rho_i)$$


This tells us that if we are more ignorant about the make-up of our state, then
the entropy increases.


The entropy obeys a number of further properties. Two which are particularly impor- 
tant are: 


• Subadditivity: If $\mathcal{H} = \mathcal{H}_A\otimes\mathcal{H}_B$ then


$$S(\rho_{AB}) \leq S(\rho_A) + S(\rho_B) \qquad (5.32)$$


with equality only if the two systems are uncorrelated, so that $\rho_{AB} = \rho_A\otimes\rho_B$.
Subadditivity tells us that the entropy of the whole is no greater than the sum of the
entropies of its parts. This result is fairly straightforward to prove, although we won’t do so here.


• Strong Subadditivity: If $\mathcal{H} = \mathcal{H}_A\otimes\mathcal{H}_B\otimes\mathcal{H}_C$ then

$$S(\rho_{ABC}) + S(\rho_B) \leq S(\rho_{AB}) + S(\rho_{BC})$$


This result is famously tricky to prove. It’s perhaps best thought of by thinking
of AB and BC as two systems which overlap on B. Then strong subadditivity
says that the total entropy of the two parts is not less than the total entropy
together with the entropy of their overlap.

5.4 Measurement


The act of measurement is one of the more mysterious aspects of quantum mechanics. 
It is here that we appear to abandon unitary evolution in favour of the abrupt collapse 
of the wavefunction, and it is here that we must embrace the indeterministic nature 
of the quantum world. In this section, we’ll take a closer look at what we mean by 
measurement. 


5.4.1 Projective Measurements 


We start by recalling what we learned in previous courses. An observable in quantum
mechanics is a Hermitian operator $O$. We can decompose this in a spectral representation,
meaning we write


$$O = \sum_m\lambda_m P_m \qquad (5.33)$$


where $\lambda_m$ are the eigenvalues of $O$ and $P_m$ are the projectors onto the corresponding
eigenspaces. The projection operators obey $P_m = P_m^\dagger$. The eigenspaces are necessarily
orthogonal, meaning

$$P_m P_n = P_m\,\delta_{mn} \qquad (5.34)$$

Moreover, the eigenvectors span the entire Hilbert space, so we also have


$$\sum_m P_m = \mathbb{1} \qquad (5.35)$$


Given a state $|\psi\rangle$, the result of a measurement in quantum mechanics is dictated by
two further axioms. The first says that a measurement of the operator $O$ returns the
result $\lambda_m$ with probability


$$p(m) = \langle\psi|P_m|\psi\rangle \qquad (5.36)$$

This is the Born rule.


The second axiom states that after the measurement, the system no longer sits in the state
$|\psi\rangle$. Instead, the act of measurement has disturbed the state, leaving it in the new
state


$$|\psi\rangle \mapsto \frac{P_m|\psi\rangle}{\sqrt{p(m)}} \qquad (5.37)$$


where the $\sqrt{p(m)}$ in the denominator is there to ensure that the resulting state is
correctly normalised. The non-unitary evolution captured by (5.37) is the infamous
collapse of the wavefunction.


There are a couple of simple generalisations of the above formalism. First, suppose
that we start with a mixed state, described by a density matrix $\rho$. Then the Born rule
(5.36) and collapse (5.37) are replaced by


$$p(m) = \mathrm{Tr}(\rho P_m) \quad\text{and}\quad \rho \mapsto \frac{P_m\,\rho\,P_m}{p(m)} \qquad (5.38)$$


Note, in particular, that the resulting density matrix still has unit trace, as it must to
describe a state.


As an alternative scenario, suppose that we don’t know the outcome of the measurement.
In this case, the collapse of the wavefunction turns an initial state $|\psi\rangle$ into a
mixed state, described by the density matrix


$$|\psi\rangle\langle\psi| \mapsto \sum_m p(m)\,\frac{P_m|\psi\rangle\langle\psi|P_m}{p(m)} = \sum_m P_m|\psi\rangle\langle\psi|P_m \qquad (5.39)$$


If we don’t gain any knowledge after our quantum system interacts with the measuring 
apparatus, this is the correct description of the resulting state. 


We can rephrase this discussion without making reference to the original operator 
O. We say that a measurement consists of presenting a quantum state with a complete 
set of orthogonal projectors {Pm}. These obey (5.34) and (5.35). We ask the system 
“Which of these are you described by?” and the system responds by picking one. This 
is referred to as a projective measurement. 


In this way of stating things, the projection operators take centre stage. The answer 
to a projective measurement is sufficient to tell us the value of any physical observable 
O whose spectral decomposition (5.33) is in terms of the projection operators {Pm} 
which we measured. In this way, the answer to a projective measurement can only 
furnish us with information about commuting observables, since these have spectral 
representations in terms of the same set of projection operators. 


Gleason’s Theorem 


Where does the Born rule come from? Usually in quantum mechanics, it is simply
proffered as a postulate, one that agrees with experiment. Nonetheless, it is the rule
that underlies the non-deterministic nature of quantum mechanics and given this is
such a departure from classical mechanics, it seems worth exploring in more detail.

There have been many attempts to derive the Born rule from something simpler,
none of them very convincing. But there is a mathematical theorem which gives some
comfort. This is Gleason’s theorem, which we state here without proof. The theorem
says that for any Hilbert space $\mathcal{H}$ of dimension $\dim\mathcal{H} \geq 3$, the only consistent way of
assigning probabilities $p(m)$ to all projection operators $P_m$ acting on $\mathcal{H}$ is through the
map


$$p(m) = \mathrm{Tr}(\rho P_m)$$


for some self-adjoint, positive operator $\rho$ with unit trace. Gleason’s theorem doesn’t
tell us why we’re obliged to introduce probabilities associated to projection operators.
But it does tell us that if we want to go down that path then the only possible way to
proceed is to introduce a density matrix $\rho$ and invoke the Born rule.


5.4.2 Generalised Measurements 


There are circumstances where it is useful to go beyond the framework of projective 
measurements. Obviously, we’re not going to violate any tenets of quantum mechanics, 
and we won’t be able to determine the values of observables that don’t commute. 
Nonetheless, focussing only on projection operators can be too restrictive. 


A generalised measurement consists of presenting a quantum state with a complete set
of Hermitian, positive operators $\{E_m\}$ and asking: “Which of these are you described
by?”. As before, the system will respond by picking one.


We will require that the operators $E_m$ satisfy the following three properties:
• Hermitian: $E_m = E_m^\dagger$
• Complete: $\sum_m E_m = \mathbb{1}$
• Positive: $\langle\psi|E_m|\psi\rangle \geq 0$ for all states $|\psi\rangle$.
These are all true for projection operators $\{P_m\}$ and the projective measurements
described above are a special case. But the requirements here are weaker. In particular,
in contrast to projective measurements, the number of $E_m$ in the set can be larger than
the dimension of the Hilbert space. A set of operators $\{E_m\}$ obeying these three
conditions is called a positive operator-valued measure, or POVM for short.

Given a quantum state $|\psi\rangle$, we will define the probability of finding the answer $E_m$
to our generalised measurement to be


$$p(m) = \langle\psi|E_m|\psi\rangle$$

Alternatively, if we are given a density matrix $\rho$, the probability of finding the answer
$E_m$ is

$$p(m) = \mathrm{Tr}(\rho E_m) \qquad (5.40)$$


At the moment we will take the above rules as a definition, a generalisation of the 
usual Born rule. Note, however, that the completeness and positivity requirements 
above ensure that p(m) define a good probability distribution. Shortly we will see how 
this follows from the more familiar projective measurements. 


An Example: State Determination 


Before we place generalised measurements in a more familiar setting, let’s first see how
they may be useful. Suppose that someone hands you a qubit and tells you that
it’s either $|\uparrow\rangle$ or it’s $|\rightarrow\rangle = (|\uparrow\rangle + |\downarrow\rangle)/\sqrt{2}$. How can you find out which state you’ve
been given?


The standard rules of quantum mechanics ensure that there’s no way to distinguish
two non-orthogonal states with absolute certainty. Nonetheless, we can see how well
we can do. Let’s start with projective measurements. We can consider the set


$$P_1 = |\uparrow\rangle\langle\uparrow| \quad,\quad P_2 = |\downarrow\rangle\langle\downarrow|$$


If the result of the measurement is $P_1$ then we can’t say anything. If, however, the
result of the measurement is $P_2$ then we must have been handed the state $|\rightarrow\rangle$ because
the other state obeys $P_2|\uparrow\rangle = 0$ and so has vanishing probability of giving the answer
$P_2$. This means that if we’re handed a succession of states $|\uparrow\rangle$ and $|\rightarrow\rangle$, each with
equal probability, then we can use projective measurements to correctly identify which
one we have 25% of the time.


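Generalised measurements allow us to do better. Consider now the set of operators

$$E_1 = \tfrac{1}{2}|\downarrow\rangle\langle\downarrow| \quad,\quad E_2 = \tfrac{1}{2}|\leftarrow\rangle\langle\leftarrow| \quad,\quad E_3 = \mathbb{1} - E_1 - E_2 \qquad (5.41)$$


If the result of the measurement is $E_1$ then we must have been handed the state $|\rightarrow\rangle$.
But, by the same argument, if the result of the measurement is $E_2$ then we must have been
handed the state $|\uparrow\rangle$. Finally, if the result is $E_3$ then we’ve got no way of knowing
which state we were handed. The upshot is that if we’re handed a succession of states
$|\uparrow\rangle$ and $|\rightarrow\rangle$, we can now identify either state with certainty, rather than just
$|\rightarrow\rangle$. With the weights $\tfrac{1}{2}$ in (5.41), each conclusive outcome occurs with probability
$\tfrac{1}{4}$ for the corresponding state, so we get a definite answer 25% of the time. Increasing
the weights, as far as positivity of $E_3$ allows, raises this rate to $1 - 1/\sqrt{2} \approx 29\%$.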
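The outcome probabilities for a three-element POVM of the form (5.41) can be evaluated directly from $p(m) = \langle\psi|E_m|\psi\rangle$. The sketch below assumes NumPy, equal prior probabilities for $|\uparrow\rangle$ and $|\rightarrow\rangle$, and a common weight multiplying the two rank-one elements; the helper names `povm` and `success_rate`, and the variable `max_weight`, are ours. It computes how often the measurement returns a conclusive answer, first for weight $\tfrac{1}{2}$ and then for the largest weight that keeps $E_3$ positive.

```python
import numpy as np

up = np.array([1.0, 0.0])
down = np.array([0.0, 1.0])
right = (up + down) / np.sqrt(2)
left = (up - down) / np.sqrt(2)

def projector(v):
    return np.outer(v, v.conj())

def povm(weight):
    """E1 ~ |down><down|, E2 ~ |left><left|, E3 = 1 - E1 - E2."""
    E1 = weight * projector(down)
    E2 = weight * projector(left)
    return [E1, E2, np.eye(2) - E1 - E2]

def success_rate(elements):
    """Probability of a conclusive answer, for equal priors on |up> and |right>."""
    E1, E2, _ = elements
    return 0.5 * (right @ E1 @ right) + 0.5 * (up @ E2 @ up)

rate_half = success_rate(povm(0.5))

# Largest weight for which E3 remains a positive operator (an assumption we verify below)
max_weight = 1 / (1 + 1 / np.sqrt(2))
rate_max = success_rate(povm(max_weight))
min_eig_E3 = np.linalg.eigvalsh(povm(max_weight)[2]).min()
```

The outcomes never mislead: $E_1$ has zero probability on $|\uparrow\rangle$ and $E_2$ has zero probability on $|\rightarrow\rangle$, whatever the weight. The weight only controls how often the inconclusive answer $E_3$ appears.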
Generalised Measurements are Projective Measurements in Disguise


The generalised measurements are not quite as novel as they first appear. They can 
always be realised as projective measurements in disguise, where the disguise in question 
involves some hidden, larger Hilbert space. 


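Let’s first consider our POVM (5.41). Suppose that when we were handed the states
$|\uparrow\rangle$ and $|\rightarrow\rangle$, they were actually the first in a pair of qubits, whose full states were
given by


$$|\Psi_1\rangle = |\uparrow\rangle\otimes|\uparrow\rangle \quad\text{and}\quad |\Psi_2\rangle = |\rightarrow\rangle\otimes|\downarrow\rangle \qquad (5.42)$$

Now these states are orthogonal to each other and, therefore, distinguishable.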


We will suppose that the density matrix in the full Hilbert space is separable, meaning
$\rho = \rho_1\otimes\rho_2$. Someone (say, Alice) who has access to both spins can perform projective
measurements in the full four-dimensional Hilbert space, with the resulting probabilities


$$p(m) = \mathrm{Tr}_{\mathcal{H}_1}\mathrm{Tr}_{\mathcal{H}_2}(\rho P_m)$$


What about Bob, who has access only to the first spin? Written in terms of operators
acting on the first qubit, we have


$$p(m) = \mathrm{Tr}_{\mathcal{H}_1}(\rho_1 E_m) \quad\text{where}\quad E_m = \mathrm{Tr}_{\mathcal{H}_2}(\rho_2 P_m) \qquad (5.43)$$


Here the operators $E_m$ form a POVM on $\mathcal{H}_1$, the Hilbert space of the first qubit. Both
positivity and completeness follow from the properties of the density matrix $\rho_2$ and the
projection operators $P_m$. For example, completeness comes from


$$\sum_m E_m = \mathrm{Tr}_{\mathcal{H}_2}\Big(\rho_2\sum_m P_m\Big) = \mathrm{Tr}_{\mathcal{H}_2}(\mathbb{1}_1\otimes\rho_2) = \mathbb{1}_1$$


We learn that the formalism of generalised measurements allows Bob to reproduce any
information that pertains only to the first spin. This is sensible because the original
density matrix $\rho = \rho_1\otimes\rho_2$ was separable, which means that there will be no hidden
correlations between the two spins that Alice has access to, but Bob does not.


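There are different ways to arrive at a POVM of the form (5.41). For example, we
could consider the situation where we have maximal ignorance about the second spin,
so $\rho_2 = \tfrac{1}{2}\mathbb{1}_2$. We can then consider the projectors


$$P_1 = |\Psi_1\rangle\langle\Psi_1| \quad,\quad P_2 = |\Psi_2\rangle\langle\Psi_2| \quad,\quad P_3 = \mathbb{1}_4 - P_1 - P_2$$


In this case, the POVM defined by (5.43) has elements $E_1 = \tfrac{1}{2}|\uparrow\rangle\langle\uparrow|$,
$E_2 = \tfrac{1}{2}|\rightarrow\rangle\langle\rightarrow|$ and $E_3 = \mathbb{1} - E_1 - E_2$, which is of the same form as (5.41). The
elements of (5.41) itself arise if the first-qubit factors in $P_1$ and $P_2$ are replaced by the
orthogonal states $|\downarrow\rangle$ and $|\leftarrow\rangle$.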


It should be clear that the construction leading to the POVM (5.43) holds more
generally than our two-state system. A projective measurement in any Hilbert space
$\mathcal{H}_1\otimes\mathcal{H}_2$ reduces to a POVM when taken on separable density matrices. In fact the
converse is also true: any POVM can be realised by projection operators acting on a
larger Hilbert space. This follows from a fairly simple result in linear algebra known as
Naimark’s dilatation theorem (sometimes transliterated from the Russian as Neumark’s
theorem).


5.4.3 The Fate of the State 


The projective measurements that we met in Section 5.4.1 have two ingredients. The
first is the probability that a given result occurs; the second is the fate of the state
after the measurement


$$p(m) = \mathrm{Tr}(\rho P_m) \quad\text{and}\quad \rho \mapsto \frac{P_m\,\rho\,P_m}{p(m)} \qquad (5.44)$$


For our generalised measurements, we have explained how the probabilities are replaced
by $p(m) = \mathrm{Tr}(\rho E_m)$. But what happens to the state after the measurement?


We could try to take inspiration from thinking about generalised measurements in
terms of projection operators in an enlarged Hilbert space. We know that


$$\rho = \rho_1\otimes\rho_2 \mapsto \frac{P_m\,\rho\,P_m}{p(m)} \quad\Longrightarrow\quad \rho_1 \mapsto \mathrm{Tr}_{\mathcal{H}_2}\,\frac{P_m\,\rho\,P_m}{p(m)}$$


But there’s no simple way of writing this in terms of the elements of the POVM
$E_m = \mathrm{Tr}_{\mathcal{H}_2}(\rho_2 P_m)$. And this is for good reason: the POVM does not include enough
information to tell us the fate of the state.


Instead, we have to define a “square-root” of $E_m$. This is an operator $M_m$ such that


$$M_m^\dagger M_m = E_m \qquad (5.45)$$


The $M_m$ need not be Hermitian. Furthermore, these operators are not uniquely determined
by (5.45): any unitary transformation $M_m \to U M_m$ still obeys (5.45). The
completeness of the POVM means that they obey


$$\sum_m M_m^\dagger M_m = \mathbb{1}$$

The choice of $M_m$ is the extra information we need to specify the state after a
generalised measurement. If we perform a generalised measurement and find the answer
$E_m$, then the state becomes


$$\rho \mapsto \frac{M_m\,\rho\,M_m^\dagger}{p(m)} \qquad (5.46)$$


This new density matrix is Hermitian and has unit trace, as it must.


A full generalised measurement, one in which both the probabilities and the end
state are known, is specified by the set of operators $\{M_m\}$, such that $E_m = M_m^\dagger M_m$
form a POVM. The generalised measurement reduces to the projective measurement
of Section 5.4.1 only when $M_m$ are orthogonal projection operators.


Finally, note that if we make a measurement, but don’t know the result, then the
resulting density matrix is not given by (5.46), but instead by


$$\rho \mapsto \sum_m M_m\,\rho\,M_m^\dagger \qquad (5.47)$$


This generalises our result (5.39) for projective measurements.


Repeated Measurements 


The special class of projective measurements enjoys some nice properties that are not 
shared by their generalised counterparts. Perhaps the most prominent is what happens 
upon repeated measurements. 


For projective measurements, if we get a result $P_m$ the first time round, then any
subsequent measurement is guaranteed to give the same result. This result is familiar
from our earlier courses on quantum mechanics: if you measure the spin of a particle
to be up then, as long as the particle is left alone, its spin will continue to be up next
time round.


This property doesn’t hold for generalised measurements. Returning to our POVM
(5.41), a measurement of $E_1$ in the first round does not preclude a measurement of $E_2$
or $E_3$ the next time round.


An Example: Detecting a Photon 


The idea of generalised measurement is useful even when our POVM consists of projection
operators. A standard example is the detection of a photon. Before the measurement
takes place, either the photon exists, $|1\rangle$, or it doesn’t, $|0\rangle$.


A projective measurement (5.44) would tell us that if we detect a photon, then
it’s there to detect again on our next measurement. But that’s not what happens.
Typically when we detect a photon, the photon doesn’t live to tell the tale. Instead, it
is destroyed in the process. This means that whether a photon is seen or not, the end
result is always the same: no photon $|0\rangle$. In terms of our new generalised measurements,
this can be simply described by the operators


$$M_1 = |0\rangle\langle 0| \quad\text{and}\quad M_2 = |0\rangle\langle 1|$$

which corresponds to the POVM


$$E_1 = M_1^\dagger M_1 = |0\rangle\langle 0| \quad\text{and}\quad E_2 = M_2^\dagger M_2 = |1\rangle\langle 1|$$


In this case, the POVM consists of projection operators. But the collapse of the
wavefunction (5.46) differs from the usual projective measurement. Regardless of the
outcome of the initial experiment, if you now try to repeat it the photon will not be
there.
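The photon detector makes a nice test of the collapse rule (5.46). The sketch below assumes NumPy and takes the measurement operators to be $M_1 = |0\rangle\langle 0|$ and $M_2 = |0\rangle\langle 1|$, which reproduce the POVM quoted above and always leave the state $|0\rangle$ behind. The function names `probability` and `post_measurement_state`, and the initial superposition, are our own illustrative choices.

```python
import numpy as np

ket0 = np.array([1, 0], dtype=complex)   # no photon
ket1 = np.array([0, 1], dtype=complex)   # one photon

M1 = np.outer(ket0, ket0.conj())         # no click: |0><0|
M2 = np.outer(ket0, ket1.conj())         # click, photon absorbed: |0><1|

def probability(rho, M):
    """p(m) = Tr(rho E_m) with E_m = M^dagger M."""
    return np.trace(M @ rho @ M.conj().T).real

def post_measurement_state(rho, M):
    """The collapse rule (5.46); only valid when the outcome has non-zero probability."""
    return M @ rho @ M.conj().T / probability(rho, M)

psi = np.sqrt(0.3) * ket0 + np.sqrt(0.7) * ket1
rho = np.outer(psi, psi.conj())

p_click = probability(rho, M2)
rho_after_click = post_measurement_state(rho, M2)
p_click_again = probability(rho_after_click, M2)
```

A first click occurs with probability 0.7, but `p_click_again` vanishes: after the detection the state is $|0\rangle\langle 0|$ and there is nothing left to detect.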


5.5 Open Systems 


In this section we will again consider situations where the full Hilbert space decomposes
into two parts: $\mathcal{H} = \mathcal{H}_S\otimes\mathcal{H}_E$. However, we will no longer think of these factors as
the far-separated homes of Alice and Bob. Instead, $\mathcal{H}_S$ will denote the system that we
want to study, and $\mathcal{H}_E$ will denote the surrounding environment.


Here the environment is typically a vast Hilbert space which we have no way of
understanding completely. In this sense, it plays a similar role to the thermal baths
that we introduce in statistical physics. When performing an experiment on a quantum
system, much of the challenge is trying to shield it from the environment. However, in
many cases this is not possible and there will be coupling between $\mathcal{H}_S$ and $\mathcal{H}_E$. We
then say that $\mathcal{H}_S$ is an open system. The purpose of this section is to understand how
such open quantum systems behave.


5.5.1 Quantum Maps 


We will assume that the combined system+environment is described by a pure state
$|\Psi\rangle$. We’ve seen in Section 5.3.2 that, after tracing over $\mathcal{H}_E$, the system we care about
is typically described by a reduced density matrix


$$\rho = \mathrm{Tr}_{\mathcal{H}_E}|\Psi\rangle\langle\Psi|$$


We would like to understand how this density matrix evolves.

The state $|\Psi\rangle$ evolves by a unitary operator $U(t)$ acting on the full Hilbert space $\mathcal{H}$.
The story that we are about to tell only works if, at time $t = 0$, the two systems lie in
a separable state,


$$|\Psi_0\rangle = |\psi\rangle\otimes|\chi\rangle \qquad (5.48)$$


This means that the original density matrix $\rho_0 = |\psi\rangle\langle\psi|$ describes a pure state on $\mathcal{H}_S$.
We now look at how this density matrix evolves. We have


$$\rho(t) = \mathrm{Tr}_{\mathcal{H}_E}\,U(t)|\Psi_0\rangle\langle\Psi_0|U^\dagger(t) = \sum_m\langle m|U(t)|\Psi_0\rangle\langle\Psi_0|U^\dagger(t)|m\rangle$$


with $|m\rangle$ a complete basis for $\mathcal{H}_E$. This encourages us to define a set of operators on
$\mathcal{H}_S$, given by


$$M_m(t) = \langle m|U(t)|\chi\rangle = \mathrm{Tr}_{\mathcal{H}_E}\big(U(t)\,|\chi\rangle\langle m|\big) \qquad (5.49)$$

The unitarity of $U(t)$ translates into a completeness condition on the operators $M_m(t)$,


$$\sum_m M_m^\dagger(t)M_m(t) = \sum_m\langle\chi|U^\dagger(t)|m\rangle\langle m|U(t)|\chi\rangle = \mathbb{1}$$


We see that the original density matrix on $\mathcal{H}_S$ evolves as

$$\rho(t) = \sum_m M_m(t)\,\rho_0\,M_m^\dagger(t) \qquad (5.50)$$


In general, this will describe the evolution from a pure state to a mixed state. This
evolution is not, in general, reversible.


A quick comment: this evolution takes the same general form as the measurement
process (5.47), at least if we don’t gain any knowledge about the result of the measurement.
This is no coincidence. A measuring apparatus is a macroscopic system that
becomes entangled with the quantum state. In this sense, it plays a similar role to the
environment in the discussion above.


In contrast, if we read off the result of a measurement, then the resulting state is 
described by (5.46); this does not take the form (5.50). 
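The construction of the operators $M_m$ from a unitary on the full Hilbert space can be checked numerically. The following is a minimal sketch, assuming NumPy, a two-dimensional system, a three-dimensional environment, and a randomly generated unitary standing in for $U(t)$ at some fixed time; all variable names are ours. It builds $M_m = \langle m|U|\chi\rangle$ as in (5.49), then compares the evolution (5.50) with the direct route of evolving the full state and tracing out the environment.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_S, dim_E = 2, 3
dim = dim_S * dim_E

# A random unitary on H_S (x) H_E, from the QR decomposition of a complex Gaussian matrix
z = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
U, _ = np.linalg.qr(z)

chi = np.zeros(dim_E, dtype=complex)
chi[0] = 1.0                                   # initial environment state |chi>
psi = np.array([0.6, 0.8], dtype=complex)      # initial system state |psi>
rho_0 = np.outer(psi, psi.conj())

# Operators M_m = <m|U|chi>, with U indexed as U[(s, e), (s', e')]
U4 = U.reshape(dim_S, dim_E, dim_S, dim_E)
kraus = [U4[:, m, :, :] @ chi for m in range(dim_E)]

# Evolution via the sum over M_m, as in (5.50)
rho_kraus = sum(M @ rho_0 @ M.conj().T for M in kraus)

# Direct computation: evolve the full state, then trace out the environment
Psi_t = U @ np.kron(psi, chi)
rho_full = np.outer(Psi_t, Psi_t.conj()).reshape(dim_S, dim_E, dim_S, dim_E)
rho_direct = np.einsum('iaja->ij', rho_full)

completeness = sum(M.conj().T @ M for M in kraus)
```

The two routes agree, and `completeness` is the identity on $\mathcal{H}_S$. For a generic unitary the resulting $\rho(t)$ is mixed even though $\rho_0$ is pure.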


Kraus Representation Theorem 


Above, we have derived the evolution (5.50) in a rather simple example. However, it
turns out that this form has more general applicability. Consider a density operator
on $\mathcal{H}_S$ which evolves by the map


$$\rho \mapsto \mathcal{L}[\rho]$$

Such a map is sometimes called a superoperator (because it maps operators to operators,
rather than states to states). We will require some special properties of our map, most
of which are inherited from the properties of the density matrices listed in Section 5.3:


• Linearity: $\mathcal{L}[a\rho_1 + b\rho_2] = a\mathcal{L}[\rho_1] + b\mathcal{L}[\rho_2]$.
• Hermiticity Preserving: $\rho = \rho^\dagger \Rightarrow \mathcal{L}[\rho] = \mathcal{L}[\rho]^\dagger$.
• Trace Preserving: $\mathrm{Tr}\,\mathcal{L}[\rho] = \mathrm{Tr}\,\rho$.


• Complete Positivity. This one requires some explanation. It is natural to insist
that the map is positive, so that $\mathcal{L}[\rho] \geq 0$ whenever $\rho \geq 0$. However, this is
not sufficient. Instead, we require the stronger statement that the map $\mathcal{L}\otimes\mathbb{1}_E$
is positive on any extension of the Hilbert space $\mathcal{H}_S$ to $\mathcal{H}_S\otimes\mathcal{H}_E$. This is the
statement of complete positivity. It ensures that the map $\mathcal{L}\otimes\mathbb{1}_E$ will take a valid
density matrix on the composite system to another density matrix.


A superoperator obeying these conditions is called a trace preserving, completely pos- 
itive (TPCP) map, with the first two conditions taken for granted. In the quantum 
information community, this map is referred to as a quantum channel. 


The Kraus representation theorem (which we do not prove here) states that any
quantum map, obeying the four conditions above, can be written as

$$\mathcal{L}[\rho] = \sum_m M_m\,\rho\,M_m^\dagger \quad {\rm with} \quad \sum_m M_m^\dagger M_m = 1 \tag{5.51}$$

In this framework, the $M_m$ are called Kraus operators. They are not unique. The
number of Kraus operators in the quantum map does not exceed $\dim(\mathcal{H}_S)^2$.


You might wonder why the collapse of the wavefunction (5.46) fails to take the Kraus
form (5.51). It is because the map is not linear: the probability $p(m)$ which normalises
the resulting density matrix itself depends on $\rho$ through (5.40).
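The Kraus form (5.51) can be checked numerically. The following is a minimal sketch rather than anything taken from the notes: it assumes a randomly chosen unitary on $\mathcal{H}_S \otimes \mathcal{H}_E$ (qubit plus three-state environment, environment initially in $|0\rangle$), reads off the operators $M_m = \langle m|U|0\rangle$, and compares the resulting map with the partial trace over the environment. The names `kraus_ops` and `apply_kraus` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
dim_S, dim_E = 2, 3

# Random unitary on H_S (x) H_E, taken from a QR decomposition (illustrative choice).
Z = rng.normal(size=(dim_S * dim_E,) * 2) + 1j * rng.normal(size=(dim_S * dim_E,) * 2)
U, _ = np.linalg.qr(Z)

# Kraus operators M_m = <m|U|0>, where the bra and ket act on the environment only.
U4 = U.reshape(dim_S, dim_E, dim_S, dim_E)    # indices (s_out, e_out, s_in, e_in)
kraus_ops = [U4[:, m, :, 0] for m in range(dim_E)]

def apply_kraus(ops, rho):
    """The quantum map rho -> sum_m M_m rho M_m^dagger."""
    return sum(M @ rho @ M.conj().T for M in ops)

# Initial pure state of the system, with the environment in |0>.
psi = np.array([0.6, 0.8j])
rho_S = np.outer(psi, psi.conj())
env0 = np.zeros((dim_E, dim_E), dtype=complex)
env0[0, 0] = 1.0

# Evolve the composite system unitarily, then trace out the environment.
rho_SE = U @ np.kron(rho_S, env0) @ U.conj().T
rho_traced = np.einsum("aebe->ab", rho_SE.reshape(dim_S, dim_E, dim_S, dim_E))

rho_kraus = apply_kraus(kraus_ops, rho_S)
completeness = sum(M.conj().T @ M for M in kraus_ops)
```

Here `completeness` should agree with the identity on $\mathcal{H}_S$, and `rho_kraus` with `rho_traced`, up to rounding error.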


5.5.2 Decoherence 


In this section, we explore some simple examples of quantum maps. We’ll use these toy 
models to highlight some important and general features that emerge when quantum 
systems interact with an environment.

Phase-Damping

We will take the quantum system $\mathcal{H}_S$ to be our trusty qubit. Meanwhile, we will model
the environment $\mathcal{H}_E$ by a three-state system, spanned by $|0\rangle$, $|1\rangle$ and $|2\rangle$. Consider the
following unitary evolution

$$\begin{aligned} U\,|\uparrow\rangle\otimes|0\rangle &= |\uparrow\rangle\otimes\left(\sqrt{1-p}\,|0\rangle + \sqrt{p}\,|1\rangle\right) \\ U\,|\downarrow\rangle\otimes|0\rangle &= |\downarrow\rangle\otimes\left(\sqrt{1-p}\,|0\rangle + \sqrt{p}\,|2\rangle\right) \end{aligned} \tag{5.52}$$

This means that our qubit interacts with the environment with probability $p$, changing
the initial state $|0\rangle$ into either $|1\rangle$ or $|2\rangle$ depending on the state of the qubit. Note,
however, that the state of the qubit is unchanged by this interaction. So this model
describes a system in which the energies needed to change the qubit are substantially
larger than those needed to change the environment.


If you want a specific picture in mind, you could think of the qubit as a simplified
model for a heavy dust particle which, in this case, can only sit in one of two positions
$|\uparrow\rangle$ or $|\downarrow\rangle$. The environment could be a background bath of photons which scatter off
this dust particle with probability $p$.


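The Kraus operators for this quantum map are easily calculated. Using (5.49), they
are given by

$$\begin{aligned} M_0 &= \langle 0|U|0\rangle = \sqrt{1-p}\;1 \\ M_1 &= \langle 1|U|0\rangle = \sqrt{p}\,|\uparrow\rangle\langle\uparrow| \\ M_2 &= \langle 2|U|0\rangle = \sqrt{p}\,|\downarrow\rangle\langle\downarrow| \end{aligned} \tag{5.53}$$

which can be checked to obey the required completeness condition $\sum_m M_m^\dagger M_m = 1$.
The state of the qubit, described by a density matrix $\rho$, then evolves as

$$\begin{aligned} \rho \mapsto \mathcal{L}[\rho] = \sum_m M_m\,\rho\,M_m^\dagger &= (1-p)\,\rho + p\,|\uparrow\rangle\langle\uparrow|\rho|\uparrow\rangle\langle\uparrow| + p\,|\downarrow\rangle\langle\downarrow|\rho|\downarrow\rangle\langle\downarrow| \\ &= \left(1 - \tfrac{1}{2}p\right)\rho + \tfrac{1}{2}p\,\sigma^3\rho\,\sigma^3 \end{aligned}$$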


We can see the essence of this quantum map if we write the density matrix in terms of
components

$$\rho = \begin{pmatrix} \rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11} \end{pmatrix} \ \mapsto\ \begin{pmatrix} \rho_{00} & (1-p)\rho_{01} \\ (1-p)\rho_{10} & \rho_{11} \end{pmatrix}$$

We learn that the off-diagonal components are suppressed by the evolution. It is
these off-diagonal elements which encode possible superpositions of $|\uparrow\rangle$ and $|\downarrow\rangle$. The
interactions with the environment, or more precisely the resulting entanglement
with the environment, mean that these off-diagonal elements are reduced under
time evolution. This process is known as decoherence; it is the evolution of a pure state
into a mixed state through interactions with the environment.
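The suppression of the off-diagonal elements can be seen directly by applying the Kraus operators (5.53) to a sample state. This is a small illustrative sketch: the value $p = 0.2$ and the initial amplitudes are arbitrary assumptions, and `phase_damp` is a hypothetical helper name.

```python
import numpy as np

p = 0.2                                              # interaction probability (assumed value)
up = np.array([[1, 0], [0, 0]], dtype=complex)       # |up><up|
down = np.array([[0, 0], [0, 1]], dtype=complex)     # |down><down|
kraus_ops = [np.sqrt(1 - p) * np.eye(2), np.sqrt(p) * up, np.sqrt(p) * down]

def phase_damp(rho):
    """Apply the phase-damping map rho -> sum_m M_m rho M_m^dagger."""
    return sum(M @ rho @ M.conj().T for M in kraus_ops)

# A pure superposition alpha|up> + beta|down>.
psi = np.array([0.6, 0.8j])
rho = np.outer(psi, psi.conj())
rho_new = phase_damp(rho)

# The equivalent form (1 - p/2) rho + (p/2) sigma^3 rho sigma^3.
sigma3 = np.diag([1.0, -1.0])
rho_alt = (1 - p / 2) * rho + (p / 2) * (sigma3 @ rho @ sigma3)

purity_before = np.trace(rho @ rho).real
purity_after = np.trace(rho_new @ rho_new).real
```

The diagonal entries of `rho_new` match those of `rho`, the off-diagonal entries are scaled by $(1-p)$, and the purity ${\rm Tr}\,\rho^2$ drops below one, which signals a mixed state.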


We can get a better sense of this if we look at successive maps. This is a little subtle 
because it’s not obvious when we can apply successive Kraus operators (5.53). We 
will discuss this in more detail in Section 5.5.3, but for now we simply look at what 
happens. 


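We define the probability of scattering per unit time to be $\Gamma$. Then, in time $\delta t$, we
have $p = \Gamma\,\delta t \ll 1$. After a total time $t = N\delta t$, the off-diagonal terms in the density
matrix are suppressed by

$$(1-p)^N = (1 - \Gamma t/N)^N \approx e^{-\Gamma t} \tag{5.54}$$

Suppose that we initially prepare our qubit in a state

$$|\psi\rangle = \alpha|\uparrow\rangle + \beta|\downarrow\rangle \qquad |\alpha|^2 + |\beta|^2 = 1$$

Then after time $t$, the density matrix becomes

$$\rho(t) = \begin{pmatrix} |\alpha|^2 & \alpha\beta^*\,e^{-\Gamma t} \\ \alpha^*\beta\,e^{-\Gamma t} & |\beta|^2 \end{pmatrix}$$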


We see that these off-diagonal components decay exponentially quickly, with the system
ultimately settling down into a mixed state. The choice of preferred basis $|\uparrow\rangle$, $|\downarrow\rangle$ can
be traced to the form of the original interaction (5.52).
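The limit in (5.54) is easy to check numerically. The sketch below uses arbitrary illustrative values for $\Gamma$ and $t$, and the helper name `suppression` is hypothetical; it compares $(1 - \Gamma t/N)^N$ with $e^{-\Gamma t}$ as the number of steps $N$ grows.

```python
import math

Gamma = 2.0   # scattering rate, arbitrary units (assumed value)
t = 1.5       # total elapsed time (assumed value)

def suppression(N):
    """Off-diagonal suppression after N successive maps, each with p = Gamma * t / N."""
    p = Gamma * t / N
    return (1 - p) ** N

exact = math.exp(-Gamma * t)
errors = [abs(suppression(N) - exact) for N in (10, 100, 1000, 10000)]
```

The entries of `errors` shrink roughly in proportion to $1/N$, which is the statement that many weak interactions reproduce exponential decay.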


To flesh this out a little, let’s return to our interpretation of this model in terms of
a heavy dust particle which can sit in one of two positions, $|\uparrow\rangle = |x_+\rangle$ or $|\downarrow\rangle = |x_-\rangle$.
We may, of course, choose to place this particle in a superposition

$$|\psi\rangle = \alpha|x_+\rangle + \beta|x_-\rangle$$

and hope to measure this superposition in some way. This, of course, is what happens
in the double-slit experiment. However, decoherence makes this difficult. Indeed, if
the particle takes time $t \gg \Gamma^{-1}$ to traverse the double-slit experiment then all hint
of the superposition will be washed out. Furthermore, $\Gamma^{-1}$ is typically a very short
timescale; it is the time taken for a single photon to scatter off the particle. This can be
much much shorter than the timescale over which the classical properties of the particle, say
its energy, are affected by the photons.

There is one final important lesson to take away from this model. It explains why
the decoherence occurs in the position basis $|x_\pm\rangle$ rather than, say, $(|x_+\rangle \pm |x_-\rangle)/\sqrt{2}$.
This is because the interactions (5.52) are local.

The locality of interactions is one of the key features of all physical laws; indeed, it
underlies the idea of quantum field theory. Combined with decoherence, this explains
why we only see our favourite pets in the state $|{\rm alive}\rangle$ or $|{\rm dead}\rangle$. Interactions with the
environment mean that it is overwhelmingly unlikely to observe Schrödinger’s cat in
the state $|\Psi\rangle = (|{\rm alive}\rangle + |{\rm dead}\rangle)/\sqrt{2}$.


Amplitude Damping 


Our second example will not give us further insight into decoherence, but instead
provides a simple model for the decay of an excited atom. (A more detailed look at the
dynamics underlying this can be found in Section 4.4.3.) Consider a two-state atomic
system. If the atom is in the ground state $|\uparrow\rangle$ then nothing happens, but if the atom is
in the excited state $|\downarrow\rangle$ then it decays with probability $p$, emitting a photon in the
process, so that the environment changes from $|0\rangle$ to $|1\rangle$. This is captured by the
unitary evolution

$$\begin{aligned} U\,|\uparrow\rangle\otimes|0\rangle &= |\uparrow\rangle\otimes|0\rangle \\ U\,|\downarrow\rangle\otimes|0\rangle &= \sqrt{1-p}\,|\downarrow\rangle\otimes|0\rangle + \sqrt{p}\,|\uparrow\rangle\otimes|1\rangle \end{aligned}$$


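The resulting Kraus operators are

$$M_0 = \langle 0|U|0\rangle = |\uparrow\rangle\langle\uparrow| + \sqrt{1-p}\,|\downarrow\rangle\langle\downarrow| \ ,\qquad M_1 = \langle 1|U|0\rangle = \sqrt{p}\,|\uparrow\rangle\langle\downarrow|$$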


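This time the quantum map is given by

$$\rho = \begin{pmatrix}\rho_{00} & \rho_{01} \\ \rho_{10} & \rho_{11}\end{pmatrix} \ \mapsto\ \begin{pmatrix}\rho_{00} + p\rho_{11} & \sqrt{1-p}\,\rho_{01} \\ \sqrt{1-p}\,\rho_{10} & (1-p)\rho_{11}\end{pmatrix}$$

If, as previously, we can think about performing this map successively, with the
probability for decay $p$ related to the lifetime $\Gamma^{-1}$ of the excited state through (5.54),
then we find the time-dependent density matrix given by

$$\rho(t) = \begin{pmatrix}\rho_{00} + (1 - e^{-\Gamma t})\rho_{11} & e^{-\Gamma t/2}\rho_{01} \\ e^{-\Gamma t/2}\rho_{10} & e^{-\Gamma t}\rho_{11}\end{pmatrix}$$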


Interestingly, as $t \to \infty$, the system ends up in the pure state $|\uparrow\rangle$, regardless of whatever
superposition or mixed state it started in. On the one hand this is not surprising: it
is simply the statement that if we wait long enough the atom will surely have decayed.
Nonetheless, it does provide a simple example in which quantum maps can take a mixed
state to a pure state.
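Iterating the amplitude-damping map makes the approach to the pure state concrete. The sketch below is illustrative only: the decay probability $p = 0.1$, the number of steps and the mixed initial state are assumed values, and `amplitude_damp` is a hypothetical helper name. The basis ordering is $(|\uparrow\rangle, |\downarrow\rangle)$, with $|\uparrow\rangle$ the ground state as in the text.

```python
import numpy as np

def amplitude_damp(rho, p):
    """One application of the amplitude-damping map with decay probability p."""
    M0 = np.array([[1, 0], [0, np.sqrt(1 - p)]], dtype=complex)   # |up><up| + sqrt(1-p)|down><down|
    M1 = np.array([[0, np.sqrt(p)], [0, 0]], dtype=complex)       # sqrt(p)|up><down|
    return M0 @ rho @ M0.conj().T + M1 @ rho @ M1.conj().T

# A mixed initial state (assumed values; trace one, positive eigenvalues).
rho = np.array([[0.3, 0.1], [0.1, 0.7]], dtype=complex)
purity_start = np.trace(rho @ rho).real

for _ in range(200):
    rho = amplitude_damp(rho, p=0.1)

purity_end = np.trace(rho @ rho).real
```

After many steps `rho` is very close to $|\uparrow\rangle\langle\uparrow|$ and the purity ${\rm Tr}\,\rho^2$ has risen towards one, in contrast to the phase-damping example where the purity falls.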
5.5.3 The Lindblad Equation


Usually in physics, the most powerful way to describe the evolution of a system is
through a differential equation. For a closed quantum system in a pure state, the
relevant equation is the Schrödinger equation. For a closed quantum system in a mixed
state, it is the Liouville equation (5.21)

$$i\hbar\frac{\partial\rho}{\partial t} = [H, \rho]$$

where the density matrix $\rho$ is an operator on $\mathcal{H}_S$. Here we would like to derive the
analogous equation for an open quantum system, where $\mathcal{H}_S$ is also coupled to an
environment $\mathcal{H}_E$.


It is not at all clear that such an equation will exist. Knowledge of the density 
matrix p on Hs at some time will not, in general, be sufficient to tell you how the 
density matrix will behave in the future. The problem is not just that the environment
can affect our system; that, after all, is what we’re trying to model. The problem is
more one of memory.


As time progresses, the system changes the environment. Our concern is that these 
changes accumulate, so that the environment starts to affect the system in different 
ways. In this sense, the environment can act as a memory, where the state of the 
system in the future depends not only on the present state, but on its entire history. 
These kinds of situations are complicated.


We've already seen a hint of this in our earlier work. Recall that when we first 
looked at quantum maps, we assumed that the initial state (5.48) was separable, with 
no correlations between $\mathcal{H}_S$ and $\mathcal{H}_E$. Had we included these correlations, we would not
have found such a simple, linear quantum map. Yet, such correlations inevitably build 
with time, meaning that we should be careful about performing successive quantum 
maps. This is a manifestation of the memory of the environment. 


To make progress, we will restrict ourselves to situations where this memory does 
not last. We will consider the environment to be vast, similar to the heat reservoirs 
that we use in statistical mechanics. We assume that correlations between the system 
and the environment are lost over a certain time scale. We will denote this time scale
by $\tau$, and seek an equation which dictates the dynamics of $\rho$ on timescales $t \gg \tau$.


Our starting point is the quantum map (5.50),

$$\rho(t+\delta t) = \sum_m M_m(t+\delta t)\,\rho(t)\,M_m^\dagger(t+\delta t) \tag{5.55}$$

We will take $\delta t$ to be small, as if we were dealing with the usual calculus of infinitesimals.
But we should bear in mind that really we want $\delta t \gg \tau$. For this equation to hold, we
must have one Kraus operator, say $M_0$, take the form $M_0 = 1 + \mathcal{O}(\delta t)$. The
remaining operators should be $M_m \sim \mathcal{O}(\sqrt{\delta t})$. We write

$$M_0 = 1 + \frac{1}{\hbar}(K - iH)\,\delta t \ ,\qquad M_m = \frac{1}{\sqrt{\hbar}}L_m\sqrt{\delta t} \quad m = 1, 2, \ldots$$

where both $H$ and $K$ are chosen to be Hermitian matrices. These Kraus operators
must obey the completeness relation (5.51),

$$\sum_{m=0} M_m^\dagger M_m = 1 \quad\Rightarrow\quad 2K + \sum_{m=1} L_m^\dagger L_m = \mathcal{O}(\delta t)$$

We therefore write

$$K = -\frac{1}{2}\sum_{m=1} L_m^\dagger L_m$$


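Plugging these expressions into the quantum map (5.55), and keeping only terms of
order $\delta t$, we get our final result

$$\hbar\frac{\partial\rho}{\partial t} = -i[H, \rho] + \sum_{m=1}\left[L_m\rho L_m^\dagger - \frac{1}{2}L_m^\dagger L_m\rho - \frac{1}{2}\rho L_m^\dagger L_m\right]$$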


This is the Lindblad equation. It should be thought of as a quantum version of the
Fokker-Planck equation that is described in the lectures on Kinetic Theory. We see that
the evolution is governed not just by the Hamiltonian $H$, but also by further Lindblad
operators $L_m$ which capture the interaction with the environment. The presence of the
final two terms ensures that $d({\rm Tr}\,\rho)/dt = 0$, as it should for a density matrix.
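As a rough numerical illustration, the Lindblad equation can be integrated with a simple forward-Euler step. The sketch below makes several assumptions that are not in the text: units with $\hbar = 1$, no Hamiltonian, and a single Lindblad operator $L = \sqrt{\Gamma/2}\,\sigma^3$, chosen because it reproduces the phase-damping map with $p = \Gamma\,\delta t$. The function name `lindblad_rhs` is hypothetical.

```python
import numpy as np

def lindblad_rhs(rho, H, L_ops, hbar=1.0):
    """Right-hand side of the Lindblad equation, returning d(rho)/dt."""
    out = -1j * (H @ rho - rho @ H)
    for L in L_ops:
        LdL = L.conj().T @ L
        out = out + L @ rho @ L.conj().T - 0.5 * (LdL @ rho + rho @ LdL)
    return out / hbar

Gamma = 1.0                                        # dephasing rate (assumed value)
sigma3 = np.diag([1.0, -1.0]).astype(complex)
H = np.zeros((2, 2), dtype=complex)                # no Hamiltonian evolution, for simplicity
L_ops = [np.sqrt(Gamma / 2) * sigma3]              # assumed choice of Lindblad operator

rho = 0.5 * np.ones((2, 2), dtype=complex)         # the state (|up> + |down>)/sqrt(2)
dt, steps = 1e-4, 10000                            # forward Euler up to t = 1
for _ in range(steps):
    rho = rho + dt * lindblad_rhs(rho, H, L_ops)

t_final = dt * steps
expected_offdiag = 0.5 * np.exp(-Gamma * t_final)
```

With this choice the populations stay fixed, the trace remains one, and the off-diagonal element tracks $\tfrac{1}{2}e^{-\Gamma t}$, in agreement with the decay found from successive Kraus maps. Forward Euler is used only for transparency; a higher-order integrator would be preferable in practice.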


The Increase of Entropy 


Something particularly nice happens when the Lindblad operators are Hermitian, so
$L_m = L_m^\dagger$. In this case, the entropy increases. The von Neumann entropy is defined as
(5.31)

$$S(\rho) = -{\rm Tr}\,\rho\log\rho$$


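Its change in time is given by

$$\frac{dS}{dt} = -{\rm Tr}\left(\frac{\partial\rho}{\partial t}\,(1 + \log\rho)\right) = -{\rm Tr}\left(\frac{\partial\rho}{\partial t}\log\rho\right)$$

where the second equality uses ${\rm Tr}\,(\partial\rho/\partial t) = 0$.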


Inserting the Lindblad equation, we see that the first term vanishes, courtesy of the
fact that $[\rho, \log\rho] = 0$. We’re left with

$$\hbar\frac{dS}{dt} = -\sum_m {\rm Tr}\left[(L_m\rho L_m - L_m L_m\rho)\log\rho\right]$$
To proceed, we decompose the density matrix $\rho$ in terms of its eigenvectors

$$\rho = \sum_i p_i\,|\phi_i\rangle\langle\phi_i|$$

and take the trace by summing over the complete basis $|\phi_i\rangle$. We have

$$\begin{aligned} \hbar\frac{dS}{dt} &= -\sum_m\sum_i \langle\phi_i|(L_m\rho L_m - L_m L_m\rho)|\phi_i\rangle\log p_i \\ &= -\sum_m\sum_{i,j}\langle\phi_i|L_m|\phi_j\rangle\langle\phi_j|L_m|\phi_i\rangle\,(p_j - p_i)\log p_i \\ &= \frac{1}{2}\sum_m\sum_{i,j}|\langle\phi_i|L_m|\phi_j\rangle|^2\,(p_i - p_j)(\log p_i - \log p_j) \end{aligned}$$

where, in going to the final line, we took advantage of the anti-symmetric properties of
the middle line under the exchange of $i$ and $j$. However, the expression $(x - y)(\log x - \log y)$
is non-negative for all positive values of $x$ and $y$. (This same fact was needed in the proof of
the H-theorem which is the classical analog of the result we’re deriving here.) We learn
that

$$\frac{dS}{dt} \geq 0$$
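The monotonic growth of the entropy can be observed in a small numerical experiment. The following sketch assumes $\hbar = 1$, no Hamiltonian, and a single Hermitian Lindblad operator $L = \sigma^1$; the initial state and step size are arbitrary illustrative values, and `von_neumann_entropy` is a hypothetical helper name.

```python
import numpy as np

def von_neumann_entropy(rho):
    """S = -Tr(rho log rho), computed from the eigenvalues p_i of rho."""
    evals = np.linalg.eigvalsh(rho)
    evals = evals[evals > 1e-12]            # 0 log 0 is taken to be 0
    return float(-np.sum(evals * np.log(evals)))

L = np.array([[0, 1], [1, 0]], dtype=complex)              # Hermitian Lindblad operator (assumed)
rho = np.array([[0.9, 0.2], [0.2, 0.1]], dtype=complex)    # valid initial density matrix (assumed)

dt, steps = 1e-3, 5000
entropies = [von_neumann_entropy(rho)]
for _ in range(steps):
    # Lindblad equation with H = 0 and Hermitian L, stepped with forward Euler.
    rho = rho + dt * (L @ rho @ L - 0.5 * (L @ L @ rho + rho @ L @ L))
    entropies.append(von_neumann_entropy(rho))
```

The list `entropies` never decreases along the evolution, and the final value exceeds the initial one, as the inequality above requires.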
6. Scattering Theory


The basic idea behind scattering theory is simple: there’s an object that you want to 
understand. So you throw something at it. By analysing how that something bounces 
off, you can glean information about the object itself. 


A very familiar example of scattering theory is called “looking at things”. In this 
section we’re going to explore what happens when you look at things by throwing a 
quantum particle at an object. 


6.1 Scattering in One Dimension 


We start by considering a quantum particle moving along a line. The maths here will 
be simple, but the physics is sufficiently interesting to exhibit many of the key ideas. 


The object that we want to understand is some potential $V(x)$. Importantly, the potential
is localised to some region of space which means that $V(x) \to 0$ as $x \to \pm\infty$.
An example is shown in Figure 30. We will need the potential to fall off suitably fast in
what follows although, for now, we won’t be careful about what this means. A
quantum particle moving along the line is governed by the
Schrödinger equation,

$$-\frac{\hbar^2}{2m}\frac{d^2\psi}{dx^2} + V(x)\psi = E\psi \tag{6.1}$$

Solutions to this equation are energy eigenstates. They evolve in time as
$\psi(x,t) = e^{-iEt/\hbar}\psi(x)$. For any potential, there are essentially two different kinds of states that
we’re interested in.


• Bound States are states that are localised in some region of space. The wavefunctions
are normalisable and have profiles that drop off exponentially far from the
potential

$$\psi(x) \sim e^{-\lambda|x|} \quad {\rm as}\ |x| \to \infty$$

Because the potential vanishes in the asymptotic region, the Schrödinger equation
(6.1) relates the asymptotic fall-off to the energy of the state,

$$E = -\frac{\hbar^2\lambda^2}{2m} \tag{6.2}$$

In particular, bound states have $E < 0$. Indeed, it is this property which ensures
that the particle is trapped within the potential and cannot escape to infinity.

Bound states are rather special. In the absence of a potential, a solution which
decays exponentially to the left will grow exponentially to the far right. But, for
the state to be normalisable, the potential has to turn this behaviour around,
so the wavefunction decreases at both $x \to -\infty$ and $x \to +\infty$. This will only
happen for specific values of $\lambda$. Ultimately, this is why the spectrum of bound
states is discrete, like in the hydrogen atom. It’s where the name “quantum”
comes from.

• Scattering States are not localised in space and, relatedly, the wavefunctions are
not normalisable. Instead, asymptotically, far from the potential, scattering states
take the form of plane waves. In one dimension, there are two possibilities

$$\text{Right moving:}\quad \psi \sim e^{ikx}$$

$$\text{Left moving:}\quad \psi \sim e^{-ikx}$$

where $k > 0$. To see why these are left or right moving, we need to put the
time dependence back in. The wavefunctions then take the form $e^{\pm ikx - iEt/\hbar}$. The
peaks and troughs of the wave move to the right with the plus sign, and to the left
with the minus sign. Solving the Schrödinger equation in the asymptotic region
with $V = 0$ gives the energy

$$E = \frac{\hbar^2k^2}{2m}$$

Scattering states have $E > 0$. Note that, in contrast to bound states, nothing
special has to happen to find scattering solutions. We expect to find solutions for
any choice of $k$.


This simple classification of solutions already tells us
something interesting. Suppose, for example, that the potential
looks something like the one shown in Figure 31.
You might think that we could find a localised solution
that is trapped between the two peaks, with $E > 0$. But
this can’t happen because if the wavefunction is to be normalisable,
it must have $E < 0$. The physical reason, of
course, is quantum tunnelling which allows the would-be bound state to escape to
infinity. We will learn more about this situation in Section 6.1.5.


6.1.1 Reflection and Transmission Amplitudes 


Suppose that we stand a long way from the potential and throw particles in. What
comes out? This is answered by solving the Schrödinger equation for the scattering
states. Because we have a second order differential equation, we expect that there
are two independent solutions for each value of $k$. We can think of these solutions
physically as what you get if you throw the particle in from the left or in from the
right. Let’s deal with each in turn.


Scattering from the Left 


We throw the particle in from the left. When it hits the potential, one of two things
can happen: it can bounce back, or it can pass straight through. Of course, this being
quantum mechanics, it can quite happily do both at the same time. Mathematically,
this means that we are looking for a solution which asymptotically takes the form

$$\psi_R(x) \sim \begin{cases} e^{ikx} + r\,e^{-ikx} & x \to -\infty \\ t\,e^{ikx} & x \to +\infty \end{cases} \tag{6.3}$$

We’ve labelled this state $\psi_R$ because the ingoing wave is right-moving. This can be seen
in the first term $e^{ikx}$ which represents the particle we’re throwing in from $x \to -\infty$. The
second term $re^{-ikx}$ represents the particle that is reflected back to $x \to -\infty$ after hitting
the potential. The coefficient $r \in \mathbb{C}$ is called the reflection amplitude. Finally, the term
$te^{ikx}$ at $x \to +\infty$ represents the particle passing through the potential. The coefficient
$t \in \mathbb{C}$ is called the transmission amplitude. (Note: in this formula $t$ is a complex
number that we have to determine; it is not time!) There is no term $e^{-ikx}$ at $x \to +\infty$
because we’re not throwing in any particles from that direction. Mathematically, we
have chosen the solution in which this term vanishes.


Before we proceed, it’s worth flagging up a conceptual point. Scattering is clearly 
a dynamical process: the particle goes in, and then comes out again. Yet there’s no 
explicit time dependence in our ansatz (6.3); instead, we have a solution formed of 
plane waves, spread throughout all of space. It’s best to think of these plane waves as 
describing a beam of particles, with the ansatz (6.3) giving us the steady-state solution 
in the presence of the potential. 


The probabilities for reflection $R$ and transmission $T$ are given by the usual quantum
mechanics rule:

$$R = |r|^2 \quad {\rm and} \quad T = |t|^2$$


In general, both R and T will be functions of the wavenumber k. This is what we would 
like to calculate for a given potential and we will see an example shortly. But, before 
we do this, there are some observations that we can make using general statements 
about quantum mechanics.

Given a solution $\psi(x)$ to the Schrödinger equation, we can construct a conserved
probability current

$$J(x) = -i\frac{\hbar}{2m}\left(\psi^*\frac{d\psi}{dx} - \psi\frac{d\psi^*}{dx}\right)$$

which obeys $dJ/dx = 0$. This means that $J(x)$ is constant. (Mathematically, this is
the statement that the Wronskian is constant for the two solutions to the Schrödinger
equation.) For our scattering solution $\psi_R$, with asymptotic form (6.3), the probability
current as $x \to -\infty$ is given by

$$\begin{aligned} J(x) &= \frac{\hbar k}{2m}\left[\left(e^{-ikx} + r^*e^{+ikx}\right)\left(e^{ikx} - re^{-ikx}\right) + \left(e^{ikx} + re^{-ikx}\right)\left(e^{-ikx} - r^*e^{+ikx}\right)\right] \\ &= \frac{\hbar k}{m}\left(1 - |r|^2\right) \qquad {\rm as}\ x \to -\infty \end{aligned}$$

Meanwhile, as $x \to +\infty$, we have

$$J(x) = \frac{\hbar k}{m}\,|t|^2 \qquad {\rm as}\ x \to +\infty$$

Equating the two gives

$$1 - |r|^2 = |t|^2 \quad\Rightarrow\quad R + T = 1 \tag{6.4}$$

This should make us happy as it means that probabilities do what probabilities are
supposed to do. The particle can only get reflected or transmitted and the sum of the
probabilities to do these things equals one.


Scattering from the Right 


This time, we throw the particle in from the right. Once again, it can bounce back off
the potential or pass straight through. Mathematically, we’re now looking for solutions
which take the asymptotic form

$$\psi_L(x) \sim \begin{cases} t'\,e^{-ikx} & x \to -\infty \\ e^{-ikx} + r'\,e^{+ikx} & x \to +\infty \end{cases} \tag{6.5}$$

where we’ve now labelled this state $\psi_L$ because the ingoing wave, at $x \to +\infty$, is
left-moving. We’ve called the reflection and transmission amplitudes $r'$ and $t'$.

There is a simple relation between the two solutions $\psi_R$ in (6.3) and $\psi_L$ in (6.5).
This follows because the potential $V(x)$ in (6.1) is a real function, so if $\psi_R$ is a solution
then so is $\psi_R^*$. And, by linearity, so is $\psi_R^* - r^*\psi_R$ which is given by

$$\psi_R^*(x) - r^*\psi_R(x) \sim \begin{cases} (1 - |r|^2)\,e^{-ikx} & x \to -\infty \\ t^*e^{-ikx} - r^*t\,e^{ikx} & x \to +\infty \end{cases}$$


This takes the same functional form as (6.5) except we need to divide through by $t^*$ to
make the normalisations agree. (Recall that scattering states aren’t normalised anyway
so we’re quite at liberty to do this.) Using $1 - |r|^2 = |t|^2$, this tells us that there is a
solution of the form (6.5) with

$$t' = t \quad {\rm and} \quad r' = -\frac{r^*t}{t^*} \tag{6.6}$$

Notice that the transmission amplitudes are always the same, but the reflection amplitudes
can differ by a phase. Nonetheless, this is enough to ensure that the reflection
probabilities are the same whether we throw the particle from the left or right: $R = |r|^2 = |r'|^2$.

An Example: A Pothole in the Road

Let’s compute $r$ and $t$ for a simple potential, given by

$$V(x) = \begin{cases} -V_0 & -a/2 < x < a/2 \\ 0 & {\rm otherwise} \end{cases}$$

with $V_0 > 0$. This looks like a pothole in the middle of an otherwise flat potential (Figure 32).


Outside the potential, we have the usual plane waves $\psi \sim e^{\pm ikx}$. In the middle of
the potential, the solutions to the Schrödinger equation (6.1) take the form

$$\psi(x) = Ae^{iqx} + Be^{-iqx} \qquad x \in [-a/2, a/2] \tag{6.7}$$

where

$$q^2 = \frac{2mV_0}{\hbar^2} + k^2$$

To compute the reflection and transmission amplitudes, $r$, $r'$ and $t$, we need to patch
the solution (6.7) with either (6.3) or (6.5) at the edges of the potential.

Let’s start by scattering from the left, with the solution (6.3) outside the potential.
Continuity of the wavefunction at $x = \pm a/2$ tells us that

$$e^{-ika/2} + re^{ika/2} = Ae^{-iqa/2} + Be^{iqa/2} \quad {\rm and} \quad te^{ika/2} = Ae^{iqa/2} + Be^{-iqa/2}$$

Meanwhile, matching the derivatives of $\psi$ at $x = \pm a/2$ gives

$$\frac{k}{q}\left(e^{-ika/2} - re^{ika/2}\right) = Ae^{-iqa/2} - Be^{iqa/2} \quad {\rm and} \quad \frac{kt}{q}\,e^{ika/2} = Ae^{iqa/2} - Be^{-iqa/2}$$
These are four equations with four unknowns: $A$, $B$, $r$ and $t$. One way to proceed is
to add and subtract the two equations on the right, and then do the same for the two
equations on the left. This allows us to eliminate $A$ and $B$,

$$\begin{aligned} 2A &= t\left(1 + \frac{k}{q}\right)e^{i(k-q)a/2} = \left(1 + \frac{k}{q}\right)e^{-i(k-q)a/2} + r\left(1 - \frac{k}{q}\right)e^{i(k+q)a/2} \\ 2B &= t\left(1 - \frac{k}{q}\right)e^{i(k+q)a/2} = \left(1 - \frac{k}{q}\right)e^{-i(k+q)a/2} + r\left(1 + \frac{k}{q}\right)e^{i(k-q)a/2} \end{aligned}$$

We’ve still got some algebraic work ahead of us. It’s grungy but straightforward. Solving
these two remaining equations gives us the reflection and transmission coefficients
that we want. They are

$$\begin{aligned} r &= \frac{(k^2 - q^2)\sin(qa)\,e^{-ika}}{(q^2 + k^2)\sin(qa) + 2iqk\cos(qa)} \\ t &= \frac{2iqk\,e^{-ika}}{(q^2 + k^2)\sin(qa) + 2iqk\cos(qa)} \end{aligned} \tag{6.8}$$

Even for this simple potential, the amplitudes are far from trivial. Indeed, they contain
a lot of information. Perhaps the simplest lesson we can extract comes from looking at
the limit $k \to 0$, where $r \to -1$ and $t \to 0$. This means that if you throw the particle
very softly ($k \to 0$), then it won’t make it through the potential; it’s guaranteed to
bounce back.

Conversely, in the limit $k \to \infty$, we have $r \to 0$. (Recall that $q^2 = k^2 + 2mV_0/\hbar^2$ so
we also have $q \to \infty$ in this limit.) By conservation of probability, we must then have
$|t| = 1$ and the particle is guaranteed to pass through. This is what you might expect;
if you throw the particle hard enough, it barely notices that the potential is there.

There are also very specific values of the incoming momentum for which $r = 0$ and the
particle is assured of passage through the potential. This occurs when $qa = n\pi$ with
$n \in \mathbb{Z}$, for which $\sin(qa) = 0$. Notice that you have to fine tune the incoming momentum so
that it depends on the details of the potential which, in this example, means $V_0$ and $a$.

We can repeat the calculation above for scattering from the right. In fact, for our
pothole potential, the result is exactly the same and we have $r = r'$. This arises because
$V(x) = V(-x)$ so it’s no surprise that scattering from the left and right are the same.
We’ll revisit this in Section 6.1.3.
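The limits discussed above can be checked by evaluating the amplitudes (6.8) numerically. The sketch below works in units with $m = \hbar = 1$ and uses illustrative values $V_0 = 3$, $a = 1$; these numbers and the function name `pothole_amplitudes` are assumptions made for the example.

```python
import cmath
import math

def pothole_amplitudes(k, V0, a, m=1.0, hbar=1.0):
    """Reflection and transmission amplitudes (6.8) for a pothole of depth V0 and width a."""
    q = math.sqrt(k**2 + 2 * m * V0 / hbar**2)
    D = (q**2 + k**2) * math.sin(q * a) + 2j * q * k * math.cos(q * a)
    phase = cmath.exp(-1j * k * a)
    r = (k**2 - q**2) * math.sin(q * a) * phase / D
    t = 2j * q * k * phase / D
    return r, t

V0, a = 3.0, 1.0                      # illustrative values, units with m = hbar = 1

# Generic momentum: probabilities sum to one.
r, t = pothole_amplitudes(1.3, V0, a)
R, T = abs(r)**2, abs(t)**2

# Very slow particle: r -> -1 and t -> 0.
r_slow, t_slow = pothole_amplitudes(1e-6, V0, a)

# Fine-tuned momentum with q * a = pi: perfect transmission.
k_res = math.sqrt((math.pi / a)**2 - 2 * V0)
r_res, t_res = pothole_amplitudes(k_res, V0, a)
```

For these parameters `R + T` equals one to rounding error, `r_slow` is close to $-1$, and `r_res` vanishes while `abs(t_res)` is one.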


6.1.2 Introducing the S-Matrix 


The S-matrix is a convenient way of packaging the information about reflection and
transmission coefficients. It is useful both because it highlights new features of the 
problem, and because it generalises to scattering in higher dimensions. 


We will start by writing the above solutions in slightly different notation. We have
two ingoing asymptotic wavefunctions, one from the left and one from the right

Ingoing
right-moving: $\mathcal{I}_R(x) = e^{+ikx}$ as $x \to -\infty$
left-moving: $\mathcal{I}_L(x) = e^{-ikx}$ as $x \to +\infty$

Similarly, there are two outgoing asymptotic wavefunctions,

Outgoing
right-moving: $\mathcal{O}_R(x) = e^{+ikx}$ as $x \to +\infty$
left-moving: $\mathcal{O}_L(x) = e^{-ikx}$ as $x \to -\infty$

The two asymptotic solutions (6.3) and (6.5) can then be written as

$$\begin{pmatrix}\psi_R \\ \psi_L\end{pmatrix} = \begin{pmatrix}\mathcal{I}_R \\ \mathcal{I}_L\end{pmatrix} + S\begin{pmatrix}\mathcal{O}_R \\ \mathcal{O}_L\end{pmatrix} \tag{6.9}$$

where

$$S = \begin{pmatrix} t & r \\ r' & t' \end{pmatrix} \tag{6.10}$$

This is the S-matrix. As we’ve seen, for any given problem the entries of the matrix
are rather complicated functions of $k$.

The S-matrix has many nice properties, some of which we will describe in these
lectures. One of the simplest and most important is that $S$ is unitary. To see this note
that

$$SS^\dagger = \begin{pmatrix} |t|^2 + |r|^2 & tr'^* + rt'^* \\ t^*r' + t'r^* & |t'|^2 + |r'|^2 \end{pmatrix}$$

Unitarity then follows from the conservation of probability. The off-diagonal elements
vanish by virtue of the relations $t' = t$ and $r' = -r^*t/t^*$ that we found in (6.6). Meanwhile,
the diagonal elements are equal to one by (6.4) and so $SS^\dagger = 1$. The equivalence
between conservation of probability and unitarity of the S-matrix is important, and will
generalise to higher dimensions. Indeed, in quantum mechanics the word “unitarity”
is often used synonymously with “conservation of probability”.
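Unitarity can be verified numerically for any amplitudes obeying $|r|^2 + |t|^2 = 1$, once $r'$ and $t'$ are fixed by (6.6). The moduli and phases below are arbitrary illustrative choices, not values from a particular potential.

```python
import cmath
import numpy as np

# Illustrative amplitudes with |r|^2 + |t|^2 = 1 and arbitrary phases (assumed values).
r = 0.6 * cmath.exp(0.7j)
t = 0.8 * cmath.exp(-1.9j)

# Amplitudes for scattering from the right, fixed by the relation (6.6).
t_prime = t
r_prime = -r.conjugate() * t / t.conjugate()

# S-matrix in the layout of (6.10).
S = np.array([[t, r], [r_prime, t_prime]])
unitarity_defect = np.abs(S @ S.conj().T - np.eye(2)).max()
```

The quantity `unitarity_defect` is zero up to rounding, and `abs(r_prime)` equals `abs(r)`, so the reflection probabilities from the two sides agree.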


One further property follows from the fact that the wavefunctions $\psi_R(x)$ and $\psi_L(x)$
do not change under complex conjugation if we simultaneously flip $k \to -k$. In other
words $\psi(x; k) = \psi^*(x; -k)$. This means that the S-matrix obeys

$$S^*(k) = S(-k)$$

There are a number of other, more hidden properties of the S-matrix that we will
uncover below.


6.1.3 A Parity Basis for Scattering 


As we’ve seen above, for symmetric potentials, with $V(x) = V(-x)$, scattering from
the left and right is the same. Let’s first make this statement more formal.

We introduce the parity operator $P$ which acts on functions $f(x)$ as

$$P: f(x) \mapsto f(-x)$$

For symmetric potentials, we have $[P, H] = 0$ which means that eigenstates of the
Hamiltonian can be chosen so that they are also eigenstates of $P$. The parity operator
is Hermitian, $P^\dagger = P$, so its eigenvalues $\lambda$ are real. But we also have $P^2 f(x) = f(x)$,
which means that the eigenvalues must obey $\lambda^2 = 1$. Clearly there are only two
possibilities: $\lambda = +1$ and $\lambda = -1$. This means that eigenstates of the Hamiltonian can
be chosen to be either even functions ($\lambda = +1$) or odd functions ($\lambda = -1$).

Above we worked with scattering eigenstates $\psi_R$ and $\psi_L$. These are neither odd nor
even. Instead, for a symmetric potential, they are related by $\psi_L(x) = \psi_R(-x)$. This is
the reason that symmetric potentials have $r = r'$. If we want to work with the parity
eigenstates, we take

$$\begin{aligned} \psi_+(x) &= \psi_R(x) + \psi_L(x) = \psi_R(x) + \psi_R(-x) \\ \psi_-(x) &= -\psi_R(x) + \psi_L(x) = -\psi_R(x) + \psi_R(-x) \end{aligned}$$

which obey $P\psi_\pm(x) = \pm\psi_\pm(x)$.


Often, working with parity eigenstates makes the algebra a little easier. This is
particularly true if our problem has a parity-invariant potential, $V(x) = V(-x)$.


The Pothole Example Revisited 


Let’s see how the use of parity eigenstates can make our calculations simpler. We’ll
redo the scattering calculation in the pothole, but now we’ll take the asymptotic states
to be $\psi_+$ and $\psi_-$. Physically, you can think of this experiment as throwing in particles
from both the left and right at the same time, with appropriate choices of signs.

We start with the even parity wavefunction $\psi_+$. We want to patch this onto a solution
in the middle, but this too must have even parity. This means that the solution in the
pothole takes the form

$$\psi(x) = A\left(e^{iqx} + e^{-iqx}\right) \qquad x \in [-a/2, a/2]$$

which now has only one unknown coefficient, $A$. As previously, $q^2 = k^2 + 2mV_0/\hbar^2$. We
still need to make sure that both the wavefunction and its derivative are continuous at
$x = \pm a/2$. But, because we’re working with even functions, we only need to look at
one of these points. At $x = a/2$ we get

$$\begin{aligned} e^{-ika/2} + (r+t)\,e^{ika/2} &= A\left(e^{iqa/2} + e^{-iqa/2}\right) \\ \frac{k}{q}\left(-e^{-ika/2} + (r+t)\,e^{ika/2}\right) &= A\left(e^{iqa/2} - e^{-iqa/2}\right) \end{aligned}$$

Notice that only the combination $(r+t)$ appears. We have two equations with two
unknowns. If we divide the two equations and rearrange, we get

$$r + t = -e^{-ika}\,\frac{q\tan(qa/2) - ik}{q\tan(qa/2) + ik} \tag{6.11}$$


which is all a lot easier than the messy manipulations we had to do when working with
$\psi_L$ and $\psi_R$. Of course, we’ve only got an expression for $(r+t)$. But we can play the
same game for the odd parity eigenstates to get a corresponding expression for $(r-t)$.
Now, the solution in the pothole takes the form

$$\psi(x) = B\left(e^{iqx} - e^{-iqx}\right) \qquad x \in [-a/2, a/2]$$

Requiring continuity of the wavefunction and its derivative at $x = a/2$ we get

$$\begin{aligned} e^{-ika/2} + (r-t)\,e^{ika/2} &= B\left(e^{iqa/2} - e^{-iqa/2}\right) \\ \frac{k}{q}\left(-e^{-ika/2} + (r-t)\,e^{ika/2}\right) &= B\left(e^{iqa/2} + e^{-iqa/2}\right) \end{aligned}$$

Once again, dividing we find

$$r - t = -e^{-ika}\,\frac{q + ik\tan(qa/2)}{q - ik\tan(qa/2)} \tag{6.12}$$

It’s not immediately obvious that the expressions (6.11) and (6.12) are the same as
those for $r$ and $t$ that we derived previously. But a little bit of algebra should convince
you that they agree.


[A helping hand: this little bit of algebra is extremely fiddly if you don’t go about
it in the right way! Here’s a reasonably streamlined approach. First define the
denominator of (6.8) as $D(k) = (q^2 + k^2)\sin(qa) + 2iqk\cos(qa)$. Using the double-angle
formula from trigonometry, we can write this as $D(k) = 2\cos^2(qa/2)\,(q\tan(qa/2) + ik)(q - ik\tan(qa/2))$.
We can then add the two expressions in (6.8), and use the double-angle formula again, to get
$r + t = 2e^{-ika}\cos^2(qa/2)\,(q\tan(qa/2) - ik)(ik\tan(qa/2) - q)/D(k)$.
This coincides with our formula (6.11). Similar games give us the formula
(6.12).]
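For readers who would rather check the agreement numerically than algebraically, the following sketch compares the direct amplitudes (6.8) with those recovered from (6.11) and (6.12). It assumes units with $m = \hbar = 1$ and arbitrary illustrative values of $k$, $V_0$ and $a$; the function names are hypothetical.

```python
import cmath
import math

def amplitudes_direct(k, q, a):
    """r and t from the left-scattering calculation, equation (6.8)."""
    D = (q**2 + k**2) * math.sin(q * a) + 2j * q * k * math.cos(q * a)
    phase = cmath.exp(-1j * k * a)
    return (k**2 - q**2) * math.sin(q * a) * phase / D, 2j * q * k * phase / D

def amplitudes_parity(k, q, a):
    """r and t recovered from the parity-basis results (6.11) and (6.12)."""
    tan = math.tan(q * a / 2)
    phase = cmath.exp(-1j * k * a)
    r_plus_t = -phase * (q * tan - 1j * k) / (q * tan + 1j * k)
    r_minus_t = -phase * (q + 1j * k * tan) / (q - 1j * k * tan)
    return (r_plus_t + r_minus_t) / 2, (r_plus_t - r_minus_t) / 2

k, V0, a = 1.3, 3.0, 1.0               # illustrative values, units with m = hbar = 1
q = math.sqrt(k**2 + 2 * V0)

r_direct, t_direct = amplitudes_direct(k, q, a)
r_parity, t_parity = amplitudes_parity(k, q, a)
```

The two pairs of amplitudes agree to rounding error for generic parameter values.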


The S-Matrix in the Parity Basis 


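We can also think about the S-matrix using our new basis of states. The asymptotic
ingoing modes are even and odd functions, given at $|x| \to \infty$ by

Ingoing
parity-even: $\mathcal{I}_+(x) = e^{-ik|x|}$
parity-odd: $\mathcal{I}_-(x) = {\rm sign}(x)\,e^{-ik|x|}$

The two asymptotic outgoing modes are

Outgoing
parity-even: $\mathcal{O}_+(x) = e^{+ik|x|}$
parity-odd: $\mathcal{O}_-(x) = -{\rm sign}(x)\,e^{+ik|x|}$

These are related to our earlier modes by a simple change of basis,

$$\begin{pmatrix}\mathcal{I}_+ \\ \mathcal{I}_-\end{pmatrix} = M\begin{pmatrix}\mathcal{I}_R \\ \mathcal{I}_L\end{pmatrix} \quad {\rm and} \quad \begin{pmatrix}\mathcal{O}_+ \\ \mathcal{O}_-\end{pmatrix} = M\begin{pmatrix}\mathcal{O}_R \\ \mathcal{O}_L\end{pmatrix} \quad {\rm with} \quad M = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}$$

We can define an S-matrix with respect to this parity basis. In analogy with (6.9), we
write asymptotic solutions as

$$\begin{pmatrix}\psi_+ \\ \psi_-\end{pmatrix} = \begin{pmatrix}\mathcal{I}_+ \\ \mathcal{I}_-\end{pmatrix} + S^P\begin{pmatrix}\mathcal{O}_+ \\ \mathcal{O}_-\end{pmatrix} \tag{6.13}$$

where we use the notation $S^P$ to denote the S-matrix with respect to the parity basis.
We write

$$S^P = \begin{pmatrix} S_{++} & S_{+-} \\ S_{-+} & S_{--} \end{pmatrix}$$

This is related to our earlier S-matrix by a change of basis. We have

$$S^P = MSM^{-1} = \begin{pmatrix} t + (r + r')/2 & (r - r')/2 \\ (r' - r)/2 & t - (r + r')/2 \end{pmatrix}$$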


As you may expect, this basis is particularly useful if the underlying potential is symmetric,
so $V(x) = V(-x)$. In this case we have $r = r'$ and the S-matrix becomes
diagonal. The diagonal components are simply

$$S_{++} = t + r \quad {\rm and} \quad S_{--} = t - r$$

In fact, because $S$ is unitary, each of these components must be a phase. This follows
because $r$ and $t$ are not independent. First, they obey $|r|^2 + |t|^2 = 1$. Moreover, when
$r' = r$, the relation (6.6) becomes

$$rt^* + r^*t = 0 \quad\Rightarrow\quad {\rm Re}(rt^*) = 0$$

This is enough to ensure that both $S_{++}$ and $S_{--}$ are indeed phases. We write them as

$$S_{++} = e^{2i\delta_+(k)} \quad {\rm and} \quad S_{--} = e^{2i\delta_-(k)}$$

We learn that for scattering off a symmetric potential, all the information is encoded
in two momentum-dependent phase shifts, $\delta_\pm(k)$, which tell us how the phases of the
outgoing waves $\mathcal{O}_\pm$ are changed with respect to the ingoing waves $\mathcal{I}_\pm$.
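As an illustration, the phase shifts for the pothole can be read off from (6.11) and (6.12). The sketch below assumes units with $m = \hbar = 1$ and illustrative values of $k$, $V_0$ and $a$; the function name `pothole_phase_shifts` is hypothetical, and the phase shifts are only defined modulo $\pi$.

```python
import cmath
import math

def pothole_phase_shifts(k, V0, a, m=1.0, hbar=1.0):
    """Return S_++, S_--, delta_+ and delta_- for the pothole, using (6.11) and (6.12)."""
    q = math.sqrt(k**2 + 2 * m * V0 / hbar**2)
    tan = math.tan(q * a / 2)
    phase = cmath.exp(-1j * k * a)
    S_pp = -phase * (q * tan - 1j * k) / (q * tan + 1j * k)   # S_++ = t + r
    S_mm = phase * (q + 1j * k * tan) / (q - 1j * k * tan)    # S_-- = t - r
    # S = exp(2 i delta), so delta is half the argument of S.
    return S_pp, S_mm, cmath.phase(S_pp) / 2, cmath.phase(S_mm) / 2

S_pp, S_mm, delta_plus, delta_minus = pothole_phase_shifts(k=1.3, V0=3.0, a=1.0)
```

Both `S_pp` and `S_mm` have unit modulus for real $k$, since in each case the numerator and denominator are complex conjugates of one another.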


6.1.4 Bound States 


So far we’ve focussed only on the scattering states of the problem. We now look at
the bound states, which have energy $E < 0$ and are localised inside the potential.
Here, something rather magical happens. It turns out that the information about these
bound states can be extracted from the S-matrix, which we constructed purely from
knowledge of the scattering states.

To find the bound states, we need to do something clever. We take our scattering
solutions, which depend on momentum $k \in \mathbb{R}$, and extend them to the complex
momentum plane. This means that we analytically continue our solutions so that they
depend on $k \in \mathbb{C}$.


First note that the solutions with k € C still obey our original Schrödinger equation 
(6.1) since, at no point in any of our derivation did we assume that k € R. The only 
difficulty comes when we look at how the wavefunctions behave asymptotically. In 
particular, any putative solution will, in general, diverge exponentially as x — +00 
or x — —oo, rendering the wavefunction non-normalisable. However, as we will now 
show, there are certain solutions that survive. 


For simplicity, let's assume that we have a symmetric potential $V(x) = V(-x)$. As we've seen above, this means that there's no mixing between the parity-even and parity-odd wavefunctions. We start by looking at the parity-even states. The general solution takes the form

$$\psi_+(x) = \mathcal{I}_+(x) + S_{++}\,\mathcal{O}_+(x) = \begin{cases} e^{+ikx} + S_{++}\, e^{-ikx} & x \to -\infty \\ e^{-ikx} + S_{++}\, e^{+ikx} & x \to +\infty \end{cases}$$


Suppose that we make $k$ pure imaginary and write

$$k = i\lambda$$

with $\lambda > 0$. Then we get

$$\psi_+(x) = \begin{cases} e^{-\lambda x} + S_{++}\, e^{+\lambda x} & x \to -\infty \\ e^{+\lambda x} + S_{++}\, e^{-\lambda x} & x \to +\infty \end{cases} \qquad (6.14)$$

Both terms proportional to $S_{++}$ decay asymptotically, but the other terms diverge. This is bad. However, there's a get-out. For any fixed $k$ (whether real or complex), $S_{++}$ is simply a number. That means that we're quite at liberty to divide by it. Indeed, the wavefunction above isn't normalised anyway, so dividing by a constant isn't going to change anything. We get

$$\frac{\psi_+(x)}{S_{++}} = \begin{cases} S_{++}^{-1}\, e^{-\lambda x} + e^{+\lambda x} & x \to -\infty \\ S_{++}^{-1}\, e^{+\lambda x} + e^{-\lambda x} & x \to +\infty \end{cases} \qquad (6.15)$$

Now we can see the loop-hole. The wavefunction above is normalisable whenever we can find a $\lambda > 0$ such that

$$S_{++}(k) \to \infty \quad\text{as}\quad k \to i\lambda$$


This, then, is the magic of the S-matrix. Poles in the complex momentum plane that lie on the positive imaginary axis (i.e. $k = i\lambda$ with $\lambda > 0$) correspond to bound states. This information also tells us the energy of the bound state since, as we saw in (6.2), it is given by

$$E = -\frac{\hbar^2\lambda^2}{2m}$$

We could also have set $k = -i\lambda$, with $\lambda > 0$. In this case, it is the terms proportional to $S_{++}$ in (6.14) which diverge and the wavefunction is normalisable only if $S_{++}(k = -i\lambda) = 0$. However, since $S_{++}$ is a phase, this is guaranteed to be true whenever $S_{++}(k = i\lambda)$ has a pole, and simply gives us back the solution above.

Finally, note that exactly the same arguments hold for parity-odd wavefunctions. There is a bound state whenever $S_{--}(k)$ has a pole at $k = i\lambda$ with $\lambda > 0$.


An Example: Stuck in the Pothole 

We can illustrate this with our favourite example of the square well, of depth $-V_0$ and width $a$. We already computed the S-matrix in (6.11) and (6.12). We have,

$$S_{++}(k) = r + t = -e^{-ika}\, \frac{q\tan(qa/2) - ik}{q\tan(qa/2) + ik}$$

where $q^2 = 2mV_0/\hbar^2 + k^2$. Setting $k = i\lambda$, we see that this has a pole when

$$\lambda = q\tan\left(\frac{qa}{2}\right) \quad\text{with}\quad \lambda^2 + q^2 = \frac{2mV_0}{\hbar^2}$$

These are the usual equations that you have to solve when finding parity-even bound states in a square well. The form of the solutions is simplest to see if we plot these equations, as shown in the left-hand panel of Figure 33. There is always at least one bound state, with more appearing as the well gets deeper.

Figure 33: Bound states of even parity always exist, since the two equations shown on the left always have a solution with $\lambda, q > 0$. Bound states of odd parity, shown on the right, exist if the potential is deep enough.


Similarly, if we look at the parity-odd wavefunctions, we have

$$S_{--}(k) = t - r = e^{-ika}\, \frac{q + ik\tan(qa/2)}{q - ik\tan(qa/2)}$$

which has a pole at $k = i\lambda$ when

$$q = -\lambda\tan\left(\frac{qa}{2}\right) \quad\text{with}\quad \lambda^2 + q^2 = \frac{2mV_0}{\hbar^2} \qquad (6.16)$$

This too reproduces the equations that we found in earlier courses in quantum mechanics when searching for bound states in a square well. Now there is no guarantee that a bound state exists; this only happens if the potential is deep enough.
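
The pole conditions for the square well are transcendental, so it can be useful to solve them numerically. The following is a minimal sketch, not part of the original notes, that finds the allowed values of $q$ by bisection. The function name `bound_state_q` and the parameter `u0` (standing for $2mV_0/\hbar^2$) are hypothetical choices made for this illustration. To avoid the poles of the tangent, each condition is multiplied through by $\cos(qa/2)$ before root finding.

```python
import math

def bound_state_q(u0, a, parity):
    """Solve the square-well pole conditions for q (illustrative sketch).

    u0     : 2 m V0 / hbar^2 (hypothetical name for the rescaled depth)
    a      : width of the well
    parity : +1 for even states (lambda = q tan(qa/2)),
             -1 for odd states  (q = -lambda tan(qa/2)),
    with lambda = sqrt(u0 - q^2) in both cases.
    """
    q_max = math.sqrt(u0)

    def h(q):
        lam = math.sqrt(max(u0 - q * q, 0.0))
        if parity == +1:
            # cos(qa/2) * (q tan(qa/2) - lambda)
            return q * math.sin(q * a / 2) - lam * math.cos(q * a / 2)
        # cos(qa/2) * (q + lambda tan(qa/2))
        return q * math.cos(q * a / 2) + lam * math.sin(q * a / 2)

    roots = []
    m = 0 if parity == +1 else 1
    # Each window [m pi/a, (m+1) pi/a] below q_max holds exactly one root.
    while m * math.pi / a < q_max:
        lo = m * math.pi / a
        hi = min((m + 1) * math.pi / a, q_max)
        for _ in range(200):  # plain bisection
            mid = 0.5 * (lo + hi)
            if h(lo) * h(mid) <= 0:
                hi = mid
            else:
                lo = mid
        roots.append(0.5 * (lo + hi))
        m += 2
    return roots

if __name__ == "__main__":
    for parity in (+1, -1):
        for q in bound_state_q(20.0, 2.0, parity):
            lam = math.sqrt(20.0 - q * q)
            print(parity, q, lam)
```

In this sketch the even sector always returns at least one root, while the odd sector returns none unless $\sqrt{2mV_0/\hbar^2} > \pi/a$, in line with the statement that odd bound states need a sufficiently deep well.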
6.1.5 Resonances 


We might wonder if there’s any other information hidden in the analytic structure of 
the S-matrix. In this section, we will see that there is, although its interpretation is a 
little more subtle. 


First, the physics. Let's think back again to the example shown in Figure 34. On the one hand, we know that there can be no bound states in such a trap because they will have $E > 0$. Any particle that we place in the trap will ultimately tunnel out. On the other hand, if the walls of the trap are very large then we might expect that the particle stays there for a long time before it eventually escapes. In this situation, we talk of a resonance. These are also referred to as unstable or metastable states. Our goal is to show how such resonances are encoded in the S-matrix.

Figure 34:

Now, the maths. We'll restrict attention to parity-even functions. Suppose that the S-matrix $S_{++}$ has a pole that lies on the complex momentum plane at position

$$k = k_0 - i\gamma$$


We'd like to interpret this pole. First note that the energy is also complex,

$$E = \frac{\hbar^2 k^2}{2m} \equiv E_0 - \frac{i\Gamma}{2} \qquad (6.17)$$

with $E_0 = \hbar^2(k_0^2 - \gamma^2)/2m$ and $\Gamma = 2\hbar^2\gamma k_0/m$. A complex energy may sound strange, but it has a very natural interpretation. Recall that the time dependence of the wavefunction is given by

$$e^{-iEt/\hbar} = e^{-iE_0 t/\hbar}\, e^{-\Gamma t/2\hbar} \qquad (6.18)$$

This is the first clue that we need. We see that, for $\gamma > 0$, the overall form of the wavefunction decays exponentially with time. This is the characteristic behaviour of unstable states. A wavefunction that is initially supported inside the trap will be very small there at times much larger than $\tau = 1/\Gamma$. Here $\tau$ is called the half-life of the state, while $\Gamma$ is usually referred to as the width of the state. (We'll see why in Section 6.2).


Where does the particle go? Including the time dependence (6.18), the same argument that led us to (6.15) now tells us that when $S_{++} \to \infty$, the solution takes the asymptotic form

$$\psi_+(x,t) = \begin{cases} e^{-iE_0 t/\hbar}\, e^{-ik_0 x}\, e^{-\gamma x - \Gamma t/2\hbar} & x \to -\infty \\ e^{-iE_0 t/\hbar}\, e^{+ik_0 x}\, e^{+\gamma x - \Gamma t/2\hbar} & x \to +\infty \end{cases} \qquad (6.19)$$

The first two exponential factors oscillate. But the final factor varies as

$$e^{\pm\gamma(x \mp vt)} \quad\text{where}\quad v = \frac{\Gamma}{2\hbar\gamma} = \frac{\hbar k_0}{m}$$

This has the interpretation of a particle moving with momentum $\hbar k_0$. This, of course, is the particle which has escaped the trap.


Note that for fixed time $t$, these wavefunctions are not normalisable: they diverge at both $x \to \pm\infty$. This shouldn't concern us, because, although our wavefunctions are eigenstates of the Hamiltonian, they are not interpreted as stationary states. Indeed, it had to be the case. An unstable state has complex energy, but standard theorems in linear algebra tell us that a Hermitian operator like the Hamiltonian must have real eigenvalues. We have managed to evade this theorem only because these wavefunctions are non-normalisable and so do not, strictly speaking, live in the Hilbert space.

There's a lesson buried in all of this. If we were to take the standard axioms of quantum mechanics, we would simply throw away wavefunctions of the form (6.19) on the grounds that they do not lie in the Hilbert space and so are unphysical. But this would be a mistake: the wavefunctions do contain interesting physics, albeit of a slightly different variety than we are used to. Sometimes it's worth pushing our physical theories beyond our comfort zone to see what is lurking there.


The upshot of this discussion is that poles of the S-matrix in the lower-half complex plane correspond to resonances. It is often useful to write $S_{++}$ as a function of energy rather than momentum. (They are related by (6.17)). Since $S_{++}$ is a phase, close to a resonance it necessarily takes the form

$$S_{++} = \frac{E - E_0 - i\Gamma/2}{E - E_0 + i\Gamma/2}$$

The fact that the S-matrix is a phase means that any pole in the complex energy plane necessarily comes with a zero at the conjugate point.
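
As a quick consistency check on this form, and assuming we write $S_{++} = e^{2i\delta_+}$ as in the discussion of symmetric potentials, one can work out the phase shift near the resonance directly. The steps below are a sketch using only the identity $|1 - e^{2i\delta}|^2 = 4\sin^2\delta$:

```latex
\begin{align}
1 - S_{++} &= \frac{i\Gamma}{E - E_0 + i\Gamma/2} \\
\sin^2\delta_+ = \tfrac{1}{4}\,|1 - S_{++}|^2 &= \frac{\Gamma^2/4}{(E - E_0)^2 + \Gamma^2/4}
\end{align}
```

For real $E$, $\sin^2\delta_+$ is therefore a Lorentzian centred at $E_0$ whose full width at half maximum is $\Gamma$, which is one way to see why $\Gamma$ is referred to as the width.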


An Example: A Pair of Delta-Functions 


A pair of delta functions provides a simple and tractable example to illustrate the idea of resonances. The potential is given by

$$V(x) = V_0\left[\delta(x-1) + \delta(x+1)\right]$$

Recall that the effect of the delta-functions is simply to change the boundary conditions at $x = \pm 1$ when solving the Schrödinger equation. All wavefunctions should be continuous at $x = \pm 1$, but their derivatives are discontinuous. For example, at $x = +1$, solutions obey

$$\lim_{\epsilon\to 0}\left[\psi'(1+\epsilon) - \psi'(1-\epsilon)\right] = U_0\,\psi(1) \quad\text{with}\quad U_0 = \frac{2mV_0}{\hbar^2}$$

Working in the parity basis makes life simpler, not least because you only need to consider the matching at one of the delta-functions, with the other then guaranteed. The computation of the S-matrix is a problem on the exercise sheet. You will find

$$S_{++} = e^{-2ik}\left[\frac{(2k - iU_0)e^{ik} - iU_0 e^{-ik}}{(2k + iU_0)e^{-ik} + iU_0 e^{ik}}\right]$$

Note that the denominator is the complex conjugate of the numerator, ensuring that $S_{++}$ is a phase, as expected. The poles of this S-matrix are given by solutions to the equation

$$e^{2ik} = -\left(1 - \frac{2ik}{U_0}\right) \qquad (6.20)$$

To understand the physics behind this, let's first look at the situation where $U_0 \to \infty$, so that the weight of the delta-functions gets infinitely large. Then the poles sit at

$$e^{2ik} = -1 \quad\Rightarrow\quad k = k_n = \left(n + \tfrac{1}{2}\right)\pi$$

These correspond to bound states trapped between the two delta-functions. For example, the $n = 0$ state is shown in Figure 35. Note that they're rather unusual because the poles sit on the real $k$-axis, rather than the imaginary $k$-axis. Correspondingly, these bound states have $E > 0$. This strange behaviour is only allowed because we have an infinitely large potential which forbids particles on one side of the barrier to cross to the other.

Figure 35:

As a side remark, we note that this same impenetrable behaviour is seen in scattering. When $U_0 \to \infty$, the S-matrix becomes $S_{++} \to -e^{-2ik}$. This tells us that a particle coming from outside is completely reflected off the infinitely large barrier. The minus sign is the standard phase change after reflection. The factor of $e^{-2ik}$ is because the waves are forbidden from travelling through the region between the delta functions, which has width $\Delta x = 2$. As a result, the phase is shifted by $e^{-2ik}$ from what it would be if the barriers were removed.


Let's now look at what happens when $U_0$ is large, but finite. We'll focus on the lowest energy bound state with $n = 0$. We can expand (6.20) in $1/U_0$. (This too is left as a problem on the exercise sheet.) We find

$$k = \frac{\pi}{2} + \alpha - i\gamma$$

with

$$\alpha \approx -\frac{\pi}{2U_0} + \frac{\pi}{2U_0^2} + \mathcal{O}\left(\frac{1}{U_0^3}\right) \quad\text{and}\quad \gamma \approx \frac{\pi^2}{4U_0^2} + \mathcal{O}\left(\frac{1}{U_0^3}\right)$$

Note, in particular, that $\gamma > 0$, so the pole moves off the real axis and into the lower half-plane. This pole now has all the properties that we described at the beginning of this section. It describes a state, trapped between the two delta-functions, which decays with half-life

$$\tau = \frac{1}{\Gamma} \approx \frac{4m}{\hbar^2\pi^3}\, U_0^2$$

to leading order in $1/U_0$. This is the resonance.
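
The expansion of the pole position in $1/U_0$ can be checked numerically. The following is a minimal sketch that applies Newton's method in the complex $k$-plane to the pole equation $e^{2ik} = -(1 - 2ik/U_0)$, starting from the $U_0 \to \infty$ position $k = (n + \tfrac{1}{2})\pi$. The function name `resonance_pole` is a hypothetical choice for this illustration, and Newton's method is simply one convenient root finder rather than anything prescribed by the notes.

```python
import cmath
import math

def resonance_pole(u0, n=0, steps=50):
    """Locate the pole of S_++ near k = (n + 1/2) pi for the double delta-function.

    Solves f(k) = exp(2ik) + 1 - 2ik/u0 = 0 by Newton iteration,
    where u0 stands for U_0 = 2 m V_0 / hbar^2.
    """
    k = complex((n + 0.5) * math.pi, 0.0)  # pole position when u0 -> infinity
    for _ in range(steps):
        f = cmath.exp(2j * k) + 1 - 2j * k / u0
        df = 2j * cmath.exp(2j * k) - 2j / u0
        k -= f / df
    return k

if __name__ == "__main__":
    u0 = 50.0
    k = resonance_pole(u0)
    alpha = -math.pi / (2 * u0) + math.pi / (2 * u0**2)
    gamma = math.pi**2 / (4 * u0**2)
    print("numerical pole :", k)
    print("series estimate:", complex(math.pi / 2 + alpha, -gamma))
```

For large `u0` the two printed values should agree up to corrections of order $1/U_0^3$, and the imaginary part of the numerical pole should be negative, placing it in the lower half-plane.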
6.2 Scattering in Three Dimensions

Our real interest in scattering is for particles moving in three spatial dimensions, with Hamiltonian

$$H = \frac{\mathbf{p}^2}{2m} + V(r)$$

Recall that there are two distinct interpretations for such a Hamiltonian


• We could think of this as the motion of a single particle, moving in a fixed background potential $V(r)$. This would be appropriate, for example, in Rutherford's famous experiment where we fire an alpha particle at a gold nucleus.

• Alternatively, we could think of this as the relative motion of two particles, separated by distance $r$, interacting through the force $\mathbf{F} = -\nabla V(r)$. We could take $V(r)$ to be the Coulomb potential, to describe the scattering of electrons, or the Yukawa potential to describe the scattering of neutrons.

In this section, we will use language appropriate to the first interpretation, but everything we say holds equally well in the second. Throughout this section, we will work with rotationally invariant (i.e. central) potentials, so that $V(\mathbf{r}) = V(|\mathbf{r}|)$.


6.2.1 The Cross-Section 


Our first goal is to decide what we want to calculate. The simple reflection and trans- 
mission coefficients of the one-dimensional problem are no longer appropriate. We need 
to replace them by something a little more complicated. We start by thinking of the 
classical situation. 


Classical Scattering 


Suppose that we throw in a single particle with kinetic energy $E$. Its initial trajectory is characterised by the impact parameter $b$, defined as the closest the particle would get to the scattering centre at $r = 0$ if there were no potential. The particle emerges with scattering angle $\theta$, which is the angle between the asymptotic incoming and outgoing trajectories, as shown in Figure 36. By solving the classical equations of motion, we can compute $\theta(b; E)$ or, equivalently, $b(\theta; E)$.

Figure 36:

Figure 37: What becomes of an infinitesimal cross-sectional area after scattering.


Now consider a uniform beam of particles, each with kinetic energy $E$. We want to understand what becomes of this beam. Consider the cross-sectional area, denoted $d\sigma$ in Figure 37. We write this as

$$d\sigma = b\, d\phi\, db$$

The particles within $d\sigma$ will evolve to lie in a cone of solid angle $d\Omega$, given by

$$d\Omega = \sin\theta\, d\phi\, d\theta$$

where, for central potentials, the infinitesimal angles $d\phi$ are the same in both these formulae. The differential cross-section is defined to be

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right|$$

The left-hand side should really be $|d\sigma/d\Omega|$, but we'll usually drop the modulus. The differential cross-section is a function of incoming momentum $k$, together with the outgoing angle $\theta$.


More colloquially, the differential cross-section can be thought of as

$$\frac{d\sigma}{d\Omega}\, d\Omega = \frac{\text{Number of particles scattered into } d\Omega \text{ per unit time}}{\text{Number of incident particles per area } d\sigma \text{ per unit time}}$$

We write this in terms of flux, defined to be the number of particles per unit area per unit time. In this language, the differential cross-section is

$$\frac{d\sigma}{d\Omega} = \frac{\text{Scattered flux}}{\text{Incident flux}}$$

We can also define the total cross-section

$$\sigma_T = \int d\Omega\, \frac{d\sigma}{d\Omega}$$


Both the differential cross-section and the total cross-section have units of area. The 
usual unit used in particle physics, nuclear physics and atomic physics is the barn, with 
1 barn = $10^{-28}$ m$^2$. The total cross-section is a crude characterisation of the scattering
power of the potential. Roughly speaking, it can be thought of as the total area of the 
incoming beam that is scattered. The differential cross-section contains more detailed 
information. 


An Example: The Hard Sphere 


Suppose that our particle bounces off a hard sphere, described by the potential $V(r) = \infty$ for $r < R$. By staring at the geometry shown in Figure 38, you can convince yourself that $b = R\sin\alpha$ and $\theta = \pi - 2\alpha$. So in this case

$$b = R\sin\left(\frac{\pi}{2} - \frac{\theta}{2}\right) = R\cos\frac{\theta}{2}$$

Figure 38:

If $b > R$, clearly there is no scattering. The differential cross-section is

$$\frac{d\sigma}{d\Omega} = \frac{R^2\cos(\theta/2)\sin(\theta/2)}{2\sin\theta} = \frac{R^2}{4}$$

Rather unusually, in this case $d\sigma/d\Omega$ is independent of both $\theta$ and $E$. The total cross-section is

$$\sigma_T = \int_0^{2\pi} d\phi \int_{-1}^{+1} d(\cos\theta)\, \frac{d\sigma}{d\Omega} = \pi R^2 \qquad (6.21)$$

which, happily, coincides with the geometrical cross-section of the sphere.


This result reinforces the interpretation of the total cross-section that we mentioned 
above; it is the area of the beam that is scattered. In general, the area of the beam 
that is scattered will depend on the energy E of the incoming particles. 
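
The classical recipe, $d\sigma/d\Omega = (b/\sin\theta)\,|db/d\theta|$ followed by an integral over solid angle, is easy to try out numerically for any known $b(\theta)$. The following is a minimal sketch that does this for the hard sphere. The function names and the use of a central finite difference with a midpoint rule are choices made for this illustration only.

```python
import math

def dsigma_domega(b_of_theta, theta, h=1e-6):
    """Classical differential cross-section (b / sin theta) |db/dtheta|.

    The derivative db/dtheta is estimated by a central finite difference.
    """
    db = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db)

def total_cross_section(b_of_theta, n=2000):
    """Integrate over solid angle: 2 pi * integral of sin(theta) dsigma/dOmega dtheta.

    Uses the midpoint rule, which avoids the endpoints theta = 0 and pi.
    """
    dtheta = math.pi / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * dtheta
        total += dsigma_domega(b_of_theta, theta) * math.sin(theta) * dtheta
    return 2 * math.pi * total

R = 1.5  # sphere radius, arbitrary illustrative value

def hard_sphere(theta):
    """Impact parameter b(theta) = R cos(theta/2) for the hard sphere."""
    return R * math.cos(theta / 2)

if __name__ == "__main__":
    print(dsigma_domega(hard_sphere, 1.0), R**2 / 4)
    print(total_cross_section(hard_sphere), math.pi * R**2)
```

The same two functions can be reused with any other $b(\theta)$, although for long-range potentials the total cross-section integral need not converge.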


Another Example: Rutherford Scattering 


Rutherford scattering is the name given to scattering off a repulsive Coulomb potential of the form

$$V(r) = \frac{A}{r} \quad\text{with}\quad A > 0$$

where, for two particles of charge $q_1$ and $q_2$, we have $A = q_1 q_2/4\pi\epsilon_0$. We studied Rutherford scattering in the lectures on Dynamics and Relativity. We found⁵

$$2bE = A\cot\frac{\theta}{2}$$

This gives the differential cross-section,

$$\frac{d\sigma}{d\Omega} = \frac{b}{\sin\theta}\left|\frac{db}{d\theta}\right| = \left(\frac{A}{4E}\right)^2 \frac{1}{\sin^4(\theta/2)} \qquad (6.22)$$

This cross-section played an important role in the history of physics. Rutherford, together with Geiger and Marsden, fired alpha particles (helium nuclei) at gold foil. They discovered that the alpha particles could be deflected by a large angle, with the cross-section given by (6.22). Rutherford realised that this meant the positive charge of the atom was concentrated in a tiny nucleus.


There is, however, a puzzle here. Rutherford did his experiment long before the 
discovery of quantum mechanics. While his data agreed with the classical result (6.22), 
there is no reason to believe that this classical result carries over to a full quantum 
treatment. We’ll see how this pans out later in this section. 


There's a surprise when we try to calculate the total cross-section $\sigma_T$. We find that it's infinite! This is because the Coulomb force is long range. The potential decays to $V(r) \to 0$ as $r \to \infty$, but it drops off very slowly. This will mean that we will have to be careful when applying our formalism to the Coulomb force.
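
To make the divergence explicit, one can integrate (6.22) only over angles larger than some small cut-off $\theta_{\min}$ (a regulator introduced here purely for illustration) and then let $\theta_{\min} \to 0$. Substituting $u = \sin^2(\theta/2)$, so that $du = \tfrac{1}{2}\sin\theta\, d\theta$, gives

```latex
\begin{align}
\sigma(\theta_{\min}) &= 2\pi \left(\frac{A}{4E}\right)^2 \int_{\theta_{\min}}^{\pi} \frac{\sin\theta\, d\theta}{\sin^4(\theta/2)}
 = 4\pi \left(\frac{A}{4E}\right)^2 \left[ \frac{1}{\sin^2(\theta_{\min}/2)} - 1 \right] \\
&= 4\pi \left(\frac{A}{4E}\right)^2 \cot^2\frac{\theta_{\min}}{2}
 \;\longrightarrow\; \infty \quad \text{as } \theta_{\min} \to 0
\end{align}
```

The result equals $\pi b^2$ evaluated at $b = (A/2E)\cot(\theta_{\min}/2)$: every particle, however large its impact parameter, is deflected by some small angle, and it is these small-angle deflections that make the total diverge.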


6.2.2 The Scattering Amplitude 


The language of cross-sections is also very natural when we look at scattering in quantum mechanics. As in Section 6.1, we set up the scattering problem as a solution to the time-independent Schrödinger equation, which now reads

$$\left[-\frac{\hbar^2}{2m}\nabla^2 + V(r)\right]\psi(\mathbf{r}) = E\,\psi(\mathbf{r}) \qquad (6.23)$$

We will send in a plane wave with energy $E$ which we choose to propagate along the $z$-direction. This is just

$$\psi_{\rm incident}(\mathbf{r}) = e^{ikz}$$

where $E = \hbar^2k^2/2m$. However, after scattering off the potential, the wave doesn't only bounce back in the $z$ direction. Instead, it spreads out spherically, albeit with a phase and amplitude which can vary around the sphere. It's hard to take photographs of quantum wavefunctions, but the water waves shown in Figure 39 give a good analogy for what's going on.

Figure 39:

⁵ See equation (4.20) of the Dynamics and Relativity lecture notes, where we denoted the scattering angle by $\phi$ instead of $\theta$.

Asymptotically, as $r \to \infty$, this scattered wave takes the form

$$\psi_{\rm scattered}(\mathbf{r}) = f(\theta,\phi)\, \frac{e^{ikr}}{r} \qquad (6.24)$$


The $1/r$ fall-off follows from solving the free Schrödinger equation; we'll see this explicitly below. However, there is a simple intuition for this behaviour which follows from thinking of $|\psi|^2$ as a probability, spreading over a sphere which grows as $r^2$ as $r \to \infty$. The $1/r$ fall-off ensures that this probability is conserved. Our final ansatz for the asymptotic wavefunction is then

$$\psi(\mathbf{r}) = \psi_{\rm incident}(\mathbf{r}) + \psi_{\rm scattered}(\mathbf{r}) \qquad (6.25)$$

The function $f(\theta,\phi)$ is called the scattering amplitude. For the central potentials considered here it is independent of $\phi$, so $f = f(\theta)$. It is the 3d generalisation of the reflection and transmission coefficients that we met in the previous section. Our goal is to calculate it.


The scattering amplitude is very closely related to the differential cross-section. To see this, we can look at the probability current

$$\mathbf{J} = -i\frac{\hbar}{2m}\left(\psi^*\nabla\psi - (\nabla\psi^*)\psi\right)$$

which obeys $\nabla\cdot\mathbf{J} = 0$. For the incident wave, we have

$$\mathbf{J}_{\rm incident} = \frac{\hbar k}{m}\,\hat{\mathbf{z}}$$

This is interpreted as a beam of particles with velocity $v = \hbar k/m$ travelling in the $z$-direction. Meanwhile, for the scattered wave we use the fact that

$$\nabla\psi_{\rm scattered} = \frac{ik f(\theta)\, e^{ikr}}{r}\,\hat{\mathbf{r}} + \mathcal{O}\left(\frac{1}{r^2}\right)$$

to find

$$\mathbf{J}_{\rm scattered} = \frac{\hbar k}{m}\frac{1}{r^2}\,|f(\theta)|^2\,\hat{\mathbf{r}} + \mathcal{O}\left(\frac{1}{r^3}\right)$$

This means that, as $r \to \infty$, the flux of outgoing particles crossing an area $dA$ subtended by the solid angle $d\Omega$ is

$$\mathbf{J}_{\rm scattered}\cdot\hat{\mathbf{r}}\, dA = \frac{\hbar k}{m}\,|f(\theta)|^2\, d\Omega$$

The differential cross-section is defined to be the ratio of the scattered flux through $d\Omega$, divided by the incident flux. In other words, it is

$$\frac{d\sigma}{d\Omega} = \frac{\hbar k\,|f(\theta)|^2/m}{\hbar k/m} = |f(\theta)|^2$$

This is rather nice. It means that if we can compute the scattering amplitude $f(\theta)$, it immediately tells us the differential cross-section. The total cross-section is defined, as before, as

$$\sigma_T = \int d\Omega\, |f(\theta)|^2$$

6.2.3 Partial Waves 


To make progress, we need to start to look in more detail at the solutions to the Schrödinger equation (6.23). Because we've decided to work with rotationally invariant potentials, it makes sense to label our wavefunctions by their angular momentum, $l$. Let's quickly review what this looks like.

A general wavefunction $\psi(r,\theta,\phi)$ can be expanded in terms of spherical harmonics. In this section, however, we only need to deal with wavefunctions of the form $\psi(r,\theta)$, which are independent of $\phi$. Such functions have an expansion in terms of partial waves

$$\psi(r,\theta) = \sum_l R_l(r)\, P_l(\cos\theta)$$

Here the $P_l(\cos\theta)$ are Legendre polynomials. They appear by virtue of being eigenstates of the angular momentum operator $\mathbf{L}^2$,

$$\mathbf{L}^2 P_l(\cos\theta) = \hbar^2 l(l+1)\, P_l(\cos\theta)$$

In more concrete terms, this is the statement that the Legendre polynomials $P_l(w)$ obey the differential equation

$$\frac{d}{dw}\left[(1-w^2)\frac{dP_l}{dw}\right] + l(l+1)\,P_l(w) = 0$$

Meanwhile, the original Schrödinger equation (6.23) becomes an ordinary differential equation for the radial functions $R_l$,

$$\left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} - \frac{l(l+1)}{r^2} - U(r) + k^2\right) R_l(r) = 0 \qquad (6.26)$$

where we've used the expression for the energy, $E = \hbar^2k^2/2m$, and rescaled the potential

$$U(r) = \frac{2m}{\hbar^2}\, V(r)$$


Spherical Waves when U(r) = 0 


We will assume that our potential drops off sufficiently quickly so that asymptotically our waves obey (6.26) with $U(r) = 0$. (We will be more precise about how fast $U(r)$ must fall off later.) We can write the equation obeyed by $R_l$ as

$$\left(\frac{d^2}{dr^2} - \frac{l(l+1)}{r^2} + k^2\right)\big(r R_l(r)\big) = 0 \qquad (6.27)$$

There are two s-wave solutions with $l = 0$, given by

$$R_0(r) = \frac{e^{\pm ikr}}{r} \qquad (6.28)$$

These are ingoing (minus sign) and outgoing (plus sign) spherical waves.

The solutions for $l \neq 0$ are known as spherical Bessel functions and are described below.


Plane Waves when U(r)=0 


Of course, when $U = 0$, the plane wave

$$\psi_{\rm incident}(\mathbf{r}) = e^{ikz} = e^{ikr\cos\theta}$$

is also a solution to the Schrödinger equation. Although it feels rather unnatural, it must be possible to expand these solutions in terms of the spherical waves. To do this, it is convenient to briefly introduce the coordinate $\rho = kr$. We write the plane wave solution as

$$\psi_{\rm incident}(\rho,\theta) = e^{i\rho\cos\theta} = \sum_l (2l+1)\, u_l(\rho)\, P_l(\cos\theta) \qquad (6.29)$$

where the factor of $(2l+1)$ is for convenience and the functions $u_l(\rho)$ are what we want to determine. The Legendre polynomials have a nice orthogonality property,

$$\frac{1}{2}\int_{-1}^{+1} dw\; P_l(w)\, P_m(w) = \frac{\delta_{lm}}{2l+1} \qquad (6.30)$$

We can use this to write

$$u_l(\rho) = \frac{1}{2}\int_{-1}^{+1} dw\; e^{i\rho w}\, P_l(w) \qquad (6.31)$$

Our interest is only in the behaviour of the plane wave as $\rho \to \infty$. To extract this, we start by integrating by parts

$$u_l(\rho) = \left[\frac{e^{i\rho w} P_l(w)}{2i\rho}\right]_{-1}^{+1} - \frac{1}{2i\rho}\int_{-1}^{+1} dw\; e^{i\rho w}\, \frac{dP_l}{dw}$$

The Legendre polynomials obey $P_l(1) = 1$ and $P_l(-1) = (-1)^l$. We then find

$$u_l(\rho) = \frac{1}{2i\rho}\left[e^{i\rho} - (-1)^l e^{-i\rho}\right] + \mathcal{O}\left(\frac{1}{\rho^2}\right) \qquad (6.32)$$

where a further integration by parts will convince you that the remaining terms do indeed drop off as $1/\rho^2$. This is the result we need. As $r \to \infty$, the incident plane wave can be written as

$$\psi_{\rm incident} = \sum_{l=0}^{\infty} \frac{2l+1}{2ik}\left[\frac{e^{ikr}}{r} - (-1)^l\, \frac{e^{-ikr}}{r}\right] P_l(\cos\theta) \qquad (6.33)$$

We learn that the ingoing plane wave decomposes into an outgoing spherical wave (the first term) together with an ingoing spherical wave (the second term).


Phase Shifts 


It's been quite a long build up, but we now know what we want to calculate, and how to do it! To recapitulate, we'd like to calculate the scattering amplitude $f(\theta)$ by finding solutions of the asymptotic form

$$\psi(\mathbf{r}) = e^{ikz} + f(\theta)\,\frac{e^{ikr}}{r} \quad\text{as}\quad r \to \infty$$

We still have a couple more definitions to make. First, we expand the scattering amplitude in partial waves as

$$f(\theta) = \sum_{l=0}^{\infty} \frac{2l+1}{k}\, f_l\, P_l(\cos\theta) \qquad (6.34)$$

The normalisation coefficients of $1/k$ and $(2l+1)$ mean that the coefficients $f_l$ sit nicely with the expansion (6.33) of the plane wave in terms of spherical waves. We can then write the asymptotic form of the wavefunction as a sum of ingoing and outgoing waves

$$\psi(\mathbf{r}) \sim \sum_{l=0}^{\infty} \frac{2l+1}{2ik}\left[(-1)^{l+1}\,\frac{e^{-ikr}}{r} + (1 + 2if_l)\,\frac{e^{ikr}}{r}\right] P_l(\cos\theta) \qquad (6.35)$$

where the first term is ingoing, and the second term is outgoing. For a given potential $V(r)$, we would like to compute the coefficients $f_l$ which, in general, are functions of $k$.

Note that the problem has decomposed into decoupled angular momentum sectors, labelled by $l = 0, 1, \ldots$. This is because we're working with a rotationally symmetric potential, which scatters an incoming wave, but does not change its angular momentum. Moreover, for each $l$, our ansatz consists of an ingoing wave, together with an outgoing wave. This is entirely analogous to our 1d solutions (6.9) when we first introduced the S-matrix. We identify the coefficients of the outgoing terms as the elements of the S-matrix. For rotationally invariant potentials, the 3d S-matrix $S$ is diagonal in the angular momentum basis, with elements given by

$$S_l = 1 + 2if_l \quad\text{with}\quad l = 0, 1, 2, \ldots$$

Now unitarity of the S-matrix, which is equivalent to conservation of particle number, requires that these diagonal elements are a pure phase. We write

$$S_l = e^{2i\delta_l} \quad\Rightarrow\quad f_l = \frac{1}{2i}\left(e^{2i\delta_l} - 1\right) = e^{i\delta_l}\sin\delta_l$$

where $\delta_l$ are the phase shifts. Comparing back to (6.34), we see that the phase shifts and scattering amplitude are related by

$$f(\theta) = \frac{1}{2ik}\sum_{l=0}^{\infty} (2l+1)\left(e^{2i\delta_l} - 1\right) P_l(\cos\theta)$$

The picture that we have is entirely analogous to the 1d situation. A wave comes in, and a wave goes out. Conservation of probability ensures that the amplitudes of these waves are the same. All information about scattering is encoded in the phase shifts $\delta_l(k)$ between the ingoing and outgoing waves.


6.2.4 The Optical Theorem 


The differential cross-section is $d\sigma/d\Omega = |f(\theta)|^2$. Using the partial wave decomposition (6.34), we have

$$\frac{d\sigma}{d\Omega} = \frac{1}{k^2}\sum_{l,l'} (2l+1)(2l'+1)\, f_l f_{l'}^*\, P_l(\cos\theta)\, P_{l'}(\cos\theta)$$

In computing the total cross-section $\sigma_T$, we can use the orthogonality of Legendre polynomials (6.30) to write

$$\sigma_T = 2\pi\int_{-1}^{+1} d(\cos\theta)\, \frac{d\sigma}{d\Omega} = \frac{4\pi}{k^2}\sum_l (2l+1)\,|f_l|^2 = \frac{4\pi}{k^2}\sum_l (2l+1)\sin^2\delta_l \qquad (6.36)$$

We can compare this to our expansion (6.34). Using the fact that $P_l(1) = 1$, we have

$$f(0) = \sum_l \frac{2l+1}{k}\, e^{i\delta_l}\sin\delta_l$$

This tells us that the total cross-section is given by

$$\sigma_T = \frac{4\pi}{k}\,\mathrm{Im}\, f(0)$$

This is known as the optical theorem.
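
The optical theorem can be checked numerically for any finite set of phase shifts. The following is a minimal sketch: it builds $f(\theta)$ from the partial wave sum, integrates $|f(\theta)|^2$ over solid angle, and compares the result with $(4\pi/k)\,\mathrm{Im}\,f(0)$. The phase shifts used in the demonstration are arbitrary made-up numbers, not those of any particular potential, and the function names are hypothetical.

```python
import cmath
import math

def legendre(l_max, w):
    """Return [P_0(w), ..., P_lmax(w)] using the Bonnet recurrence."""
    p = [1.0, w]
    for l in range(1, l_max):
        p.append(((2 * l + 1) * w * p[l] - l * p[l - 1]) / (l + 1))
    return p[: l_max + 1]

def amplitude(deltas, k, theta):
    """f(theta) = (1/k) sum_l (2l+1) exp(i delta_l) sin(delta_l) P_l(cos theta)."""
    p = legendre(len(deltas) - 1, math.cos(theta))
    return sum((2 * l + 1) * cmath.exp(1j * d) * math.sin(d) * p[l]
               for l, d in enumerate(deltas)) / k

def sigma_total(deltas, k, n=4000):
    """Integrate |f|^2 over solid angle with a midpoint rule in w = cos(theta)."""
    dw = 2.0 / n
    total = 0.0
    for i in range(n):
        w = -1.0 + (i + 0.5) * dw
        total += abs(amplitude(deltas, k, math.acos(w))) ** 2 * dw
    return 2 * math.pi * total

if __name__ == "__main__":
    deltas = [0.7, -0.3, 0.1]  # illustrative phase shifts for l = 0, 1, 2
    k = 1.3
    print("integral of |f|^2 :", sigma_total(deltas, k))
    print("(4 pi / k) Im f(0):", 4 * math.pi / k * amplitude(deltas, k, 0.0).imag)
```

The agreement relies on each $f_l$ having the unitary form $e^{i\delta_l}\sin\delta_l$; replacing it by an arbitrary complex number would generally break the equality.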


Here are some words that will hopefully build some intuition for the optical theorem. The potential causes scattering from the forward direction ($\theta = 0$) to other directions. Because total probability is conserved, clearly the amount of particles going in the forward direction must decrease. However, this decrease in the forward direction must be equal to the total increase in other directions, and this is what the total cross-section $\sigma_T$ measures. Finally, the amount of decrease in forward scattering is due to interference between the incoming wave and outgoing waves, and so is proportional to $f(0)$.

Unitarity Bounds 


If we think of the total cross-section as built from the cross-sections for each partial wave then, from (6.36), we have

$$\sigma_T = \sum_{l=0}^{\infty} \sigma_l \quad\text{with}\quad \sigma_l = \frac{4\pi}{k^2}\,(2l+1)\sin^2\delta_l \qquad (6.37)$$

Clearly each contribution is bounded as $\sigma_l \leq 4\pi(2l+1)/k^2$, with the maximum arising when the phase shift is given by $\delta_l = \pm\pi/2$. This is called the unitarity bound.


There's a straightforward, semi-classical way to understand these unitarity bounds. If we send in a particle with momentum $\hbar k$ and impact parameter $b$, then it has angular momentum $L = \hbar k b$. This angular momentum is quantised. Roughly speaking, we might expect that the particle has angular momentum $\hbar l$, with $l \in \mathbb{Z}$, when the impact parameter lies in the window

$$\frac{l}{k} \leq b \leq \frac{l+1}{k}$$

If the particle gets scattered with 100% probability when it lies in this ring, then the cross-section is equal to the area of the ring. This is

$$\frac{(l+1)^2\pi}{k^2} - \frac{l^2\pi}{k^2} = \frac{(2l+1)\pi}{k^2} \qquad (6.38)$$

This is almost the unitarity bound (6.37). It differs by a factor 4. As we will now see, that same factor of 4 difference often arises between simple classical arguments and a full quantum treatment of scattering processes.


6.2.5 An Example: A Hard Sphere and Spherical Bessel Functions 


After all this formalism, let's finally do an example. Our scattering region will be a hard sphere of radius $a$, with potential

$$V(r) = \begin{cases} \infty & r < a \\ 0 & r > a \end{cases}$$

Since the wavefunction vanishes inside the sphere and is continuous, this potential is equivalent to imposing the boundary condition $\psi(a) = 0$.

For $r > a$, the wavefunction can be decomposed in partial waves

$$\psi(r,\theta) = \sum_{l=0}^{\infty} R_l(r)\, P_l(\cos\theta)$$

where the radial wavefunction obeys the free Schrödinger equation

$$\left(\frac{d^2}{d\rho^2} + \frac{2}{\rho}\frac{d}{d\rho} - \frac{l(l+1)}{\rho^2} + 1\right) R_l(\rho) = 0 \qquad (6.39)$$

where we're again using the coordinate $\rho = kr$. Solutions $R_l(\rho)$ to this equation are known as spherical Bessel functions and are denoted $j_l(\rho)$ and $n_l(\rho)$. They are important enough that we take some time to describe their properties.
An Aside: Spherical Bessel Functions 


The solutions to (6.39) are given by spherical Bessel functions, $R_l(\rho) = j_l(\rho)$ and $R_l(\rho) = n_l(\rho)$, and can be written as⁶

$$j_l(\rho) = (-\rho)^l\left(\frac{1}{\rho}\frac{d}{d\rho}\right)^l \frac{\sin\rho}{\rho} \quad\text{and}\quad n_l(\rho) = -(-\rho)^l\left(\frac{1}{\rho}\frac{d}{d\rho}\right)^l \frac{\cos\rho}{\rho}$$

Note that $j_0(\rho) = \sin\rho/\rho$ and $n_0(\rho) = -\cos\rho/\rho$, so the solutions (6.28) for free spherical waves can be written, up to normalisation, as $R_0(\rho) = n_0(\rho) \pm i j_0(\rho)$.

⁶ Proofs of this statement, together with the asymptotic expansions given below, can be found in the handout http://www.damtp.cam.ac.uk/user/tong/aqm/bessel.pdf.

In what follows, it will be useful to have the asymptotic form of $j_l$ and $n_l$. They are given by

$$j_l(\rho) \to \frac{\sin(\rho - \tfrac{1}{2}l\pi)}{\rho} \quad\text{and}\quad n_l(\rho) \to -\frac{\cos(\rho - \tfrac{1}{2}l\pi)}{\rho} \quad\text{as}\quad \rho \to \infty \qquad (6.40)$$

We see that at large $r$, the spherical Bessel functions look more or less the same for all $l$, differing only by a phase. In particular, the combinations $j_l \pm i n_l$ look essentially the same as the $l = 0$ spherical waves that we met in (6.28). However, the spherical Bessel functions differ as we come in towards the origin. In particular, close to $\rho = 0$ we have

$$j_l(\rho) \to \frac{\rho^l}{(2l+1)!!} \quad\text{and}\quad n_l(\rho) \to -(2l-1)!!\,\rho^{-(l+1)} \quad\text{as}\quad \rho \to 0 \qquad (6.41)$$

where $(2l+1)!! = 1\cdot 3\cdot 5\cdots(2l+1)$ is the product of all odd numbers up to $2l+1$. Note that $j_l(\rho)$ is regular near the origin, while $n_l$ diverges.


Before we proceed, it's worth seeing how we write the plane wave $e^{ikz}$ in terms of spherical Bessel functions. We wrote the partial wave expansion (6.29) in terms of functions $u_l(\rho)$, whose asymptotic expansion was given in (6.32). This can be rewritten as

$$u_l(\rho) \to i^l\, \frac{\sin(\rho - \tfrac{1}{2}l\pi)}{\rho}$$

which tells us that we can identify the functions $u_l(\rho)$ as

$$u_l(\rho) = i^l j_l(\rho)$$
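
The identification of $u_l(\rho)$ with $i^l j_l(\rho)$ was motivated by the large-$\rho$ behaviour, but it can be tested at finite $\rho$ by evaluating the integral (6.31), $u_l(\rho) = \tfrac{1}{2}\int_{-1}^{+1} dw\, e^{i\rho w} P_l(w)$, directly. The following is a minimal sketch for $l = 0, 1, 2$, using the closed forms of $P_l$ and $j_l$ for these values. The midpoint rule and all names are choices made for this illustration.

```python
import cmath
import math

# Legendre polynomials P_0, P_1, P_2 in closed form
LEGENDRE = {
    0: lambda w: 1.0,
    1: lambda w: w,
    2: lambda w: 0.5 * (3 * w * w - 1),
}

def spherical_j(l, rho):
    """Closed-form spherical Bessel functions j_0, j_1, j_2."""
    s, c = math.sin(rho), math.cos(rho)
    if l == 0:
        return s / rho
    if l == 1:
        return s / rho**2 - c / rho
    if l == 2:
        return (3 / rho**3 - 1 / rho) * s - 3 * c / rho**2
    raise ValueError("only l = 0, 1, 2 are tabulated in this sketch")

def u_l(l, rho, n=20000):
    """Midpoint-rule estimate of (1/2) * integral_{-1}^{1} exp(i rho w) P_l(w) dw."""
    dw = 2.0 / n
    total = 0j
    for i in range(n):
        w = -1.0 + (i + 0.5) * dw
        total += cmath.exp(1j * rho * w) * LEGENDRE[l](w)
    return 0.5 * total * dw

if __name__ == "__main__":
    rho = 2.0
    for l in range(3):
        print(l, u_l(l, rho), (1j) ** l * spherical_j(l, rho))
```

The two columns should agree to within the accuracy of the quadrature, which is far smaller than the $\mathcal{O}(1/\rho^2)$ terms dropped in the asymptotic form.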
Back to the Hard Sphere 


Returning to our hard sphere, the general solution for $r > a$ can be written in the form,

$$R_l(r) = A_l\left[\cos\alpha_l\, j_l(\rho) - \sin\alpha_l\, n_l(\rho)\right] \qquad (6.42)$$

where, as before, $\rho = kr$. Here $A_l$ and $\alpha_l$ are two integration constants which we will fix by the boundary condition. Because the Schrödinger equation is linear, nothing fixes the overall coefficient $A_l$. In contrast, the integration constant $\alpha_l$ will be fixed by the boundary conditions at $r = a$. Moreover, this integration constant turns out to be precisely the phase shift $\delta_l$ that we want to compute. To see this, we use the asymptotic form of the spherical Bessel functions (6.40) to find

$$R_l(r) \sim \frac{1}{\rho}\left[\cos\alpha_l\,\sin\left(\rho - \tfrac{1}{2}l\pi\right) + \sin\alpha_l\,\cos\left(\rho - \tfrac{1}{2}l\pi\right)\right] = \frac{1}{\rho}\sin\left(\rho - \tfrac{1}{2}l\pi + \alpha_l\right)$$

We can compare this to the expected asymptotic form (6.35) of the wavefunction

$$R_l(r) \sim \left[(-1)^{l+1}\frac{e^{-i\rho}}{\rho} + e^{2i\delta_l}\frac{e^{i\rho}}{\rho}\right] = \frac{e^{i\delta_l}\, e^{i\pi l/2}}{\rho}\left[-e^{-i(\rho + \delta_l - \pi l/2)} + e^{i(\rho + \delta_l - \pi l/2)}\right]$$

to see that, as a function of $\rho = kr$, the two expressions agree provided

$$\alpha_l = \delta_l$$

In other words, if we can figure out the integration constant $\alpha_l$ then we've found our sought-after phase shift.


The boundary condition imposed by the hard sphere is simply R;(a) = 0. This tells 
us that 


j j (ka 
cos ô; ji(ka) = sindm(ka) = tand, = we 


This is the final result for this system. Now let’s try to extract some physics from it. 
First note that for the l = 0 s-wave, the phase shift is given by exactly by 
do = —ka 


For small momenta, ka < 1, we can extract the behaviour of the higher / phase shifts 
from p — 0 behaviour of the spherical Bessel functions (6.41). We have 


(ka)?! 
(21+ 1)! (21 — 1! 


6) & — 


We see that for low momentum the phase shifts decrease as $l$ increases. This is to
be expected: the higher $l$ modes have to penetrate the repulsive angular momentum
barrier $\sim \hbar^2 l(l+1)/2mr^2$. Classically, this would prohibit the low-momentum modes from reaching
the sphere. Quantum mechanically, only the exponential tails of these modes reach
$r = a$, which is why their scattering is suppressed.
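
The suppression of the higher partial waves can be checked numerically. The following is a minimal sketch, assuming the closed forms of the spherical Bessel functions for $l = 0, 1, 2$ with the sign convention $n_0(\rho) = -\cos\rho/\rho$ used above. The function names are illustrative and not part of the notes.

```python
import math

def spherical_j(l, x):
    """Spherical Bessel function j_l(x), closed forms for l = 0, 1, 2."""
    s, c = math.sin(x), math.cos(x)
    if l == 0:
        return s / x
    if l == 1:
        return s / x**2 - c / x
    if l == 2:
        return (3 / x**3 - 1 / x) * s - 3 * c / x**2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")

def spherical_n(l, x):
    """Spherical Neumann function n_l(x), with n_0(x) = -cos(x)/x."""
    s, c = math.sin(x), math.cos(x)
    if l == 0:
        return -c / x
    if l == 1:
        return -c / x**2 - s / x
    if l == 2:
        return (-3 / x**3 + 1 / x) * c - 3 * s / x**2
    raise ValueError("only l = 0, 1, 2 are implemented in this sketch")

def double_factorial(n):
    """n!! for odd n >= -1, using the convention (-1)!! = 1."""
    result = 1
    while n > 1:
        result *= n
        n -= 2
    return result

def hard_sphere_phase_shift(l, ka):
    """Phase shift from tan(delta_l) = j_l(ka)/n_l(ka), on the principal branch."""
    return math.atan(spherical_j(l, ka) / spherical_n(l, ka))

def low_energy_phase_shift(l, ka):
    """Small-ka approximation -(ka)^(2l+1) / ((2l+1)!! (2l-1)!!)."""
    return -ka ** (2 * l + 1) / (double_factorial(2 * l + 1) * double_factorial(2 * l - 1))

for l in range(3):
    print(l, hard_sphere_phase_shift(l, 0.1), low_energy_phase_shift(l, 0.1))
```

At $ka = 0.1$ the exact and approximate phase shifts agree to better than a few percent, and each step up in $l$ costs roughly a factor of $(ka)^2$.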


For low momentum $ka \ll 1$, we now have all the information we need to compute
the total cross-section. The sum (6.36) is dominated by the $l = 0$ s-wave, and given by

$$ \sigma_T = 4\pi a^2\left(1 + \mathcal{O}\left((ka)^2\right)\right) $$

This is a factor of 4 bigger than the classical, geometric result (6.21).
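
As a short check on the size of the corrections, we can expand the first two partial waves directly. This sketch assumes that (6.36) has the standard form $\sigma_T = (4\pi/k^2)\sum_l (2l+1)\sin^2\delta_l$, and uses $\delta_0 = -ka$ together with $\delta_1 \approx -(ka)^3/3$:

```latex
\sigma_0 = \frac{4\pi}{k^2}\sin^2(ka)
         = 4\pi a^2\left(1 - \frac{(ka)^2}{3} + \mathcal{O}\big((ka)^4\big)\right),
\qquad
\sigma_1 \approx \frac{4\pi}{k^2}\cdot 3\,\delta_1^2
         = 4\pi a^2\,\frac{(ka)^4}{3}
```

So the p-wave first contributes at relative order $(ka)^4$, while the s-wave term itself already deviates from $4\pi a^2$ at relative order $(ka)^2$.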

It's also possible to extract analytic results for the phase shifts at high momentum
$ka \gg 1$. For this we need further properties of the spherical Bessel functions. Here
we simply state the results. The phase shifts $\delta_l$ vary between 0 and $2\pi$ for $l \lesssim ka$.
However, when $l > ka$, the phase shifts quickly drop to zero. The intuition behind this
follows from the semi-classical analysis (6.38) which tells us that for $l \gg ka$, the impact
parameter is $b \gg a$. This makes it unsurprising that no scattering takes place in this
regime. It turns out that as $ka \to \infty$, the total cross-section becomes $\sigma_T \to 2\pi a^2$.


The Scattering Length 


The low-momentum behaviour $\delta_l \sim (ka)^{2l+1}$ that we saw is common to all scattering
potentials. It means that low-energy scattering is always dominated by the s-wave
whose phase shift scales as

$$ \delta_0 \sim -k a_s + \mathcal{O}(k^3) \qquad (6.43) $$

The coefficient $a_s$ is called the scattering length. As we have seen, for the hard sphere
$a_s = a$, the radius of the sphere. At low energies, the total cross-section is always given
by

$$ \sigma_T \approx \sigma_0 = 4\pi a_s^2 $$

The scattering length is a useful way to characterise the low-energy behaviour of a
potential. As we will see in examples below, $a_s$ can be positive or negative and can, at
times, diverge.


6.2.6 Bound States 


In this section we describe the effects of bound states on scattering. Such states only
occur for attractive potentials, so we again take a sphere of radius $a$, but this time with
potential

$$ V(r) = \begin{cases} -V_0 & r < a \\ 0 & r > a \end{cases} \qquad (6.44) $$

It will be useful to define the following notation

$$ U(r) = \frac{2mV(r)}{\hbar^2} \quad \text{and} \quad \gamma^2 = \frac{2mV_0}{\hbar^2} \qquad (6.45) $$

We'll start by focussing on the $l = 0$ s-wave. Outside the sphere, the wavefunction
satisfies the usual free Schrödinger equation (6.27)

$$ \left(\frac{d^2}{dr^2} + k^2\right)(r\psi) = 0 \qquad r > a $$

with general solution

$$ \psi(r) = \frac{A\sin(kr + \delta_0)}{r} \qquad r > a \qquad (6.46) $$

The same argument that we made when discussing the hard sphere shows that the
integration constant $\delta_0$ is the phase shift that we want to calculate. We do so by
matching the solution to the wavefunction inside the sphere, which satisfies

$$ \left(\frac{d^2}{dr^2} + k^2 + \gamma^2\right)(r\psi) = 0 \qquad r < a $$

The requirement that the wavefunction is regular at the origin $r = 0$ picks the solution
inside the sphere to be

$$ \psi(r) = \frac{B\sin(\sqrt{k^2 + \gamma^2}\, r)}{r} \qquad r < a \qquad (6.47) $$

The solutions (6.46) and (6.47) must be patched at $r = a$ by requiring that both
$\psi(a)$ and $\psi'(a)$ are continuous. We get the answer quickest if we combine these two
and insist that $\psi'/\psi$ is continuous at $r = a$, since this condition does not depend on
the uninteresting integration constants $A$ and $B$. A quick calculation shows that it is
satisfied when

$$ \frac{\tan(ka + \delta_0)}{ka} = \frac{\tan(\sqrt{k^2 + \gamma^2}\, a)}{\sqrt{k^2 + \gamma^2}\, a} \qquad (6.48) $$

For very high momentum scattering, $k^2 \gg \gamma^2$, we have $\delta_0 \to 0$. This is to be expected:
the energy of the particle is so large that it doesn't much care for the small, puny
potential and there is no scattering.

Bound States and the Scattering Length 

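Things are more interesting at low energies, $k^2 \ll \gamma^2$ and $ka \ll 1$. We have

$$ \frac{\tan(ka + \delta_0)}{ka} \approx \frac{\tan(\gamma a)}{\gamma a} \quad \Rightarrow \quad \frac{\tan(ka) + \tan(\delta_0)}{1 - \tan(ka)\tan(\delta_0)} \approx \frac{k}{\gamma}\tan(\gamma a) $$

Rearranging, we get

$$ \tan\delta_0 = ka\left(\frac{\tan(\gamma a)}{\gamma a} - 1\right) + \mathcal{O}(k^3) \qquad (6.49) $$

If the phase shift $\delta_0$ is small, then we can write $\tan\delta_0 \approx \delta_0$ and, from (6.43), read off
the scattering length

$$ a_s = a - \frac{\tan(\gamma a)}{\gamma} \qquad (6.50) $$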

Note that, for this approximation to hold, we need $k a_s \ll 1$, but the scattering length $a_s$
exhibits somewhat surprising behaviour. For small $\gamma$, the scattering length is negative.
This can be thought of as due to the attractive nature of the potential, which pulls the
particle into the scattering region rather than repelling it. However, as $\gamma$ is increased,
the scattering length diverges to $-\infty$, before reappearing at $+\infty$. It continues this
pattern, oscillating between $+\infty$ and $-\infty$. Our task is to understand why this striking
behaviour is happening.


Before we proceed, note that all the calculations above also hold for repulsive potentials
with $V_0 < 0$. In this case $\gamma$, defined in (6.45), is pure imaginary and the scattering
length (6.50) becomes

$$ a_s = a - \frac{\tanh(|\gamma| a)}{|\gamma|} \qquad (V_0 < 0) $$

Now the scattering length is always positive. It increases monotonically from $a_s = 0$
when $\gamma = 0$, corresponding to no scattering, through to $a_s = a$ when $|\gamma| \to \infty$, which
is our previous result for the hard-sphere. We see that whatever is causing the strange
oscillations in (6.50) does not occur for the repulsive potential.
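
The contrast between the two cases is easy to see numerically. The sketch below simply evaluates the two formulas above in units where $a = 1$. The function names and the sample values of $\gamma$ are chosen for illustration only.

```python
import math

def scattering_length_well(gamma, a=1.0):
    """Attractive well, eq. (6.50): a_s = a - tan(gamma a)/gamma."""
    return a - math.tan(gamma * a) / gamma

def scattering_length_barrier(gamma_abs, a=1.0):
    """Repulsive barrier: a_s = a - tanh(|gamma| a)/|gamma|."""
    return a - math.tanh(gamma_abs * a) / gamma_abs

# Attractive well: small and negative for a shallow well, then a sign flip
# through infinity as gamma*a crosses pi/2.
for g in (0.5, math.pi / 2 - 0.01, math.pi / 2 + 0.01):
    print(f"well    gamma*a = {g:.4f}  a_s = {scattering_length_well(g):+.3f}")

# Repulsive barrier: always between 0 and a, approaching a for a very high barrier.
for g in (0.1, 1.0, 50.0):
    print(f"barrier |gamma|a = {g:.1f}  a_s = {scattering_length_barrier(g):+.3f}")
```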


The key to the divergent behaviour of the scattering length lies in the bound states
of the theory. It's a simple matter to construct $l = 0$ bound states. We solve the
Schrödinger equation with the form

$$ \psi(r) = \begin{cases} \dfrac{A\sin(\sqrt{\gamma^2 - \lambda^2}\, r)}{r} & r < a \\[2ex] \dfrac{B e^{-\lambda r}}{r} & r > a \end{cases} $$

The two solutions have the same energy $E = -\hbar^2\lambda^2/2m$. Matching the logarithmic
derivatives across $r = a$ gives

$$ \tan(\sqrt{\gamma^2 - \lambda^2}\, a) = -\frac{\sqrt{\gamma^2 - \lambda^2}}{\lambda} \qquad (6.51) $$

This structure of the solutions is similar to what we saw in Section 6.1.4. Indeed, if
we write $q^2 = \gamma^2 - \lambda^2$, then these equations take the same form as (6.16) that describe
odd-parity states in one-dimension. In particular, this means that if the potential is
too shallow then no bound states exist. As $\gamma$ gets larger, and the potential gets deeper,
bound states start to appear. They first arise when $\lambda = 0$ and $\tan(\gamma a) = \infty$, so that

$$ \gamma = \gamma_n = \left(n + \frac{1}{2}\right)\frac{\pi}{a} \quad \text{with } n = 0, 1, \ldots $$

This coincides with the values for which the scattering length (6.50) diverges. For $\gamma$
slightly less than $\gamma_n$, the bound state has not yet appeared and the scattering length
is very large and negative. For $\gamma$ slightly greater than $\gamma_n$, the new state exists and is
weakly bound, and the scattering length is large and positive. Meanwhile, when $\gamma = \gamma_n$,
then there is a bound state which has energy $E = 0$. Such bound states are said to be
"at threshold".

The incoming wave has energy slightly above $E = 0$ and mixes strongly with the
bound state (or almost bound state) with energy a little below $E = 0$. This is what
gives rise to the divergence in the cross-section. Specifically, when there is a bound
state exactly at threshold, $\tan\delta_0 \to \infty$ and so the phase shift is $\delta_0 = (n + \tfrac{1}{2})\pi$. (Note
that at this point, we can no longer write $\delta_0 \approx -k a_s$, because this is valid only for
$k a_s \ll 1$, but $a_s$ is diverging.) The s-wave cross-section saturates the unitarity bound
(6.37),

$$ \sigma_0 = \frac{4\pi}{k^2} $$


To understand why the formation of bound states gives rise to a divergent scattering
length, we can look at the analytic structure of the S-matrix at finite $k$. We know from
(6.48) that the phase shift is given by

$$ \tan(ka + \delta_0) = \frac{k}{\sqrt{k^2 + \gamma^2}}\tan(\sqrt{k^2 + \gamma^2}\, a) \equiv f(k) $$

Rearranging, we get the s-wave component of the S-matrix

$$ S_0(k) = e^{2i\delta_0} = e^{-2ika}\,\frac{1 + if(k)}{1 - if(k)} $$

The S-matrix has a pole at $f(k) = -i$, or for values of $k$ such that

$$ \tan(\sqrt{k^2 + \gamma^2}\, a) = \frac{\sqrt{k^2 + \gamma^2}}{ik} \qquad (6.52) $$

This has no solutions for real $k$. However, it does have solutions along the positive
imaginary $k$ axis. If we set $k = i\lambda$, the equation (6.52) coincides with the condition for
bound states (6.51).

Close to the pole, the S-matrix takes the form

$$ S_0(k) = e^{2i\delta_0} = \frac{i\lambda + k}{i\lambda - k} $$

When the bound state approaches threshold, $\lambda$ is small and this form is valid in the
region near $k = 0$. For $k \ll \lambda$, we can expand in $k/\lambda$, to find $\delta_0 \approx -k/\lambda$, which tells us that
we should indeed expect to see a divergent scattering length $a_s = 1/\lambda$.
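
The relation between a weakly bound state and a large scattering length can be tested on the square well itself. The following sketch, in units where $a = 1$, solves the bound-state condition (6.51) in the equivalent form $q\cot(qa) = -\lambda$ with $q = \sqrt{\gamma^2 - \lambda^2}$ by bisection, for a well just deep enough to hold its first bound state, and compares $1/\lambda$ with the scattering length (6.50). The bracketing interval and the sample depth $\gamma a = \pi/2 + 0.05$ are illustrative choices.

```python
import math

def bound_state_lambda(gamma, a=1.0, lam_max=0.5, tol=1e-12):
    """Solve q*cot(q a) + lambda = 0, with q = sqrt(gamma^2 - lambda^2), by bisection."""
    def g(lam):
        q = math.sqrt(gamma**2 - lam**2)
        return q * math.cos(q * a) / math.sin(q * a) + lam

    lo, hi = 1e-9, lam_max
    if not (g(lo) < 0 < g(hi)):
        raise ValueError("no sign change in the bracketing interval")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

gamma = math.pi / 2 + 0.05                 # slightly deeper than the threshold value
lam = bound_state_lambda(gamma)
a_s = 1.0 - math.tan(gamma) / gamma        # eq. (6.50) with a = 1
print(f"lambda = {lam:.5f}, 1/lambda = {1 / lam:.3f}, a_s = {a_s:.3f}")
```

The two lengths agree to within a few percent here. The residual difference is of order $a$, and becomes negligible in relative terms as the state approaches threshold.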

Figure 40: The cross-section for neutron scattering off U-235.

When neutrons scatter off large nuclei at low energies they are very close to forming
a threshold bound state. The total cross-section for neutron scattering off uranium 235
is shown in Figure 40. You can see the large enhancement of the cross-section. This
is partly due to the bound state, although it is complicated by the presence of a large
number of resonances whose effects we'll discuss in the next section.


6.2.7 Resonances 


We already met the idea of resonances in Section 6.1.5. These are unstable bound 
states, which appear as poles of the S-matrix in the lower-half complex plane. Here we 
see how these resonances affect scattering in 3d. 


It’s not hard to construct examples which exhibit resonances. Indeed, the attractive, 
spherical potential (6.44) which has bound states also exhibits resonances. These don’t 
occur for s-waves, but only for higher l, where the effective potential includes an effec- 
tive, repulsive angular momentum barrier. The algebra is not conceptually any more 
difficult than what we did above, but in practice rapidly becomes a blur of spherical 
Bessel functions. 


Alternatively, we could look at the somewhat simpler example of a delta-function
cage of the form $V(r) = V_0\,\delta(r - a)$, which is the obvious 3d generalisation of the
example we looked at in Section 6.1.5 and has s-wave resonances.


Rather than getting bogged down in any of these details, here we focus on the features
that are common to all these examples. In each case, the S-matrix has a pole. Thinking
in terms of energy $E = \hbar^2k^2/2m$, these poles occur at

$$ E = E_0 - \frac{i\Gamma}{2} $$

This is the same result (6.17) that we saw in our 1d example. Close to the pole, the
S-matrix (which, by unitarity, is simply a phase) must take the form

$$ S(E) = e^{2i\delta(E)} = e^{2i\theta(E)}\,\frac{E - E_0 - i\Gamma/2}{E - E_0 + i\Gamma/2} \qquad (6.53) $$

Here $e^{2i\theta(E)}$ is the so-called continuum contribution; it is due to the usual, run-of-the-mill
phase shift that arises from scattering off the potential. Here our interest is in
the contributions that come specifically from the resonance, so we'll set $\theta = 0$. From
(6.53), we have

$$ \cos 2\delta = \frac{(E - E_0)^2 - \Gamma^2/4}{(E - E_0)^2 + \Gamma^2/4} \quad \Rightarrow \quad \sin^2\delta = \frac{\Gamma^2}{4(E - E_0)^2 + \Gamma^2} $$

Figure 41: Distribution with $\Gamma^2 = 2$... Figure 42: ...and with $\Gamma^2 = 15$.

Footnote: The data for Figure 40 is taken from the Los Alamos on-line nuclear information tour.

From this we can read off the contribution to the total cross-section using (6.36). If
the pole occurs for a partial wave with angular momentum $l$, we have

$$ \sigma_T \approx \frac{4\pi}{k^2}(2l+1)\frac{\Gamma^2}{4(E - E_0)^2 + \Gamma^2} $$

This distribution is plotted in the figure, with $E_0 = 4$ and $\Gamma^2 = 2$ and $15$. (Remember
that there is an extra factor of $E$ sitting in the $k^2$ in the formula above.) It is called the
Breit-Wigner distribution, or sometimes the Lorentzian distribution (although, strictly
speaking, neither of these has the extra factor of $1/k^2$). It exhibits a clear peak at
$E = E_0$, whose half-width at half-maximum is $\Gamma/2$. Comparing to our discussion in Section 6.1.5,
we see that the lifetime of the resonance can be read off from the width of the peak:
the narrower the peak, the longer lived the resonance.
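
The step from the pole form of the S-matrix to the Lorentzian can be checked numerically. This is a small sketch with the continuum phase set to zero, as in the text. The function names are illustrative, and the sample values $E_0 = 4$, $\Gamma^2 = 2$ are those quoted for the plot.

```python
import math

def resonant_S(E, E0, Gamma):
    """Pole form of the S-matrix, eq. (6.53), with the continuum phase theta = 0."""
    return (E - E0 - 0.5j * Gamma) / (E - E0 + 0.5j * Gamma)

def sin2_delta_from_S(S):
    """For S = exp(2 i delta), sin^2(delta) = (1 - Re S)/2."""
    return 0.5 * (1.0 - S.real)

def breit_wigner(E, E0, Gamma):
    """Lorentzian factor Gamma^2 / (4 (E - E0)^2 + Gamma^2)."""
    return Gamma**2 / (4.0 * (E - E0) ** 2 + Gamma**2)

E0, Gamma = 4.0, math.sqrt(2.0)
for E in (2.0, 3.5, 4.0, 4.5, 6.0):
    S = resonant_S(E, E0, Gamma)
    print(E, abs(S), sin2_delta_from_S(S), breit_wigner(E, E0, Gamma))
```

The modulus of $S$ is 1 at every real energy, as unitarity requires, and the Lorentzian factor falls to one half at $E = E_0 \pm \Gamma/2$.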


The Breit-Wigner distribution is something of an iconic image in particle physics
because this is the way that we discover new particles. To explain this fully would
require us to move to the framework of quantum field theory, but we can get a sense
for what's going on from what we've seen above. The key fact is that most particles
in Nature are not stable. The exceptions are the electron, the proton, neutrinos and
photons. All others decay with some lifetime $\tau$. When we collide known particles,
typically electrons or protons, we can create new particles which, since they are
unstable, show up as resonances. The energy $E_0$ corresponds to the mass of the new
particle through $E_0 = mc^2$, while the lifetime is seen in the width, $\tau = 1/\Gamma$.

Figure 43: The cross-section for the Z-boson. Figure 44: And for the Higgs boson.


Two examples are shown in the figures. The left-hand figure shows the cross-section,
now measured in pico-barns $= 10^{-40}\ \text{m}^2$, for high-energy electron-positron scattering.
We see a large resonance peak which sits at a centre of mass energy $E_0 \approx 91$ GeV
with width $\Gamma \approx 2.5$ GeV. Since we're measuring the width in units of energy, we need
a factor of $\hbar$ to convert to the lifetime

$$ \tau = \frac{\hbar}{\Gamma} $$

Using $\hbar \approx 6.6 \times 10^{-16}$ eV s, we find the lifetime of the Z-boson to be $\tau \approx 3 \times 10^{-25}$ s.
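
The conversion from width to lifetime is a one-line calculation. The sketch below uses $\hbar \approx 6.582 \times 10^{-16}$ eV s together with the Z-boson width quoted above. The helper name is illustrative.

```python
HBAR_EV_S = 6.582e-16  # reduced Planck constant in eV s

def lifetime_seconds(width_eV):
    """Lifetime tau = hbar / Gamma for a resonance of width Gamma given in eV."""
    return HBAR_EV_S / width_eV

z_lifetime = lifetime_seconds(2.5e9)  # Gamma ~ 2.5 GeV
print(f"Z boson lifetime: {z_lifetime:.2e} s")
```

This gives about $2.6 \times 10^{-25}$ s, which rounds to the $3 \times 10^{-25}$ s quoted in the text.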


The right-hand figure shows the 2012 data from the discovery of the Higgs boson,
with mass $E_0 \approx 125$ GeV. I should confess that the experiment doesn't have the
resolution to show the Breit-Wigner shape in this case. The best that can be extracted
from this plot is a bound on the width of $\Gamma < 17$ MeV or so, while the true width is
predicted by theory to be $\Gamma \sim 4$ MeV.


6.3 The Lippmann-Schwinger Equation 


So far, we've developed the machinery necessary to compute cross-sections, but our
examples have been rather artificial. The interactions between particles do not look
like spherical potential wells or shells of delta-functions. Instead, they are smooth potentials
$V(r)$, such as the Coulomb or Yukawa potentials. We would like to understand
scattering in these more realistic settings.

In principle, this is straightforward: you simply need to solve the relevant Schrödinger
equation, impose regularity at the origin, and then read off the appropriate phase shifts
asymptotically. In practice, the solution to the Schrödinger equation is rarely known
analytically. (A counterexample to this is the Coulomb potential which will be discussed
in Section 6.4.) In this section, we present a different approach to scattering that makes
use of Green's functions. This provides a platform to develop a perturbative approach
to understanding scattering for potentials that we actually care about. Moreover, these
Green's function methods also have applications in other areas of physics.


Our starting point is the Schrödinger equation

$$ \left[-\frac{\hbar^2}{2m}\nabla^2 + V(\mathbf{r})\right]\psi(\mathbf{r}) = E\psi(\mathbf{r}) \qquad (6.54) $$

We'll briefly use a more formal description of this equation, in order to write the
Lippmann-Schwinger equation in its most general form. We'll then revert back to the
form (6.54) which, for the purposes of these lectures, is all we really care about. With
this in mind, we write the Schrödinger equation as

$$ (H_0 + V)|\psi\rangle = E|\psi\rangle $$

The idea here is that we've split the Hamiltonian up into a piece that is simple to
solve (in this case $H_0 = -\hbar^2\nabla^2/2m$) and a more complicated piece, $V$. Trivially
re-arranging this equation gives

$$ (E - H_0)|\psi\rangle = V|\psi\rangle \qquad (6.55) $$

We can then formally re-arrange this equation once more to become

$$ |\psi\rangle = |\phi\rangle + \frac{1}{E - H_0}V|\psi\rangle \qquad (6.56) $$

Here $|\phi\rangle$ is a zero mode which obeys $H_0|\phi\rangle = E|\phi\rangle$. If (6.56) is multiplied by $E - H_0$
then the state $|\phi\rangle$ is annihilated and we get back to (6.55). However, the inverse
quantum operator $(E - H_0)^{-1}$ is somewhat subtle and, as we will see below, there is
very often an ambiguity in its definition. This ambiguity is resolved by writing this
inverse operator as $(E - H_0 + i\epsilon)^{-1}$, and subsequently taking the limit $\epsilon \to 0^+$. We
then write

$$ |\psi\rangle = |\phi\rangle + \frac{1}{E - H_0 + i\epsilon}V|\psi\rangle \qquad (6.57) $$

This is the Lippmann-Schwinger equation. It is not really a solution to the Schrödinger
equation (6.54) since $|\psi\rangle$ appears on both sides. It is more a rewriting of the Schrödinger
equation, but one which gives us a new way to move forward.


The Green’s Function 


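Let's now write down the Lippmann-Schwinger equation for our Schrödinger equation
(6.54). We want the inverse operator $(E - H_0)^{-1}$. But this is precisely what we call
the Green's function $G_0$. It obeys

$$ \left(E + \frac{\hbar^2}{2m}\nabla^2\right) G_0(E; \mathbf{r}, \mathbf{r}') = \delta(\mathbf{r} - \mathbf{r}') $$

The formulae will be somewhat simpler if we scale out the factor $\hbar^2/2m$. We write

$$ E = \frac{\hbar^2k^2}{2m} $$

so that

$$ (\nabla^2 + k^2)\, G_0(k; \mathbf{r}, \mathbf{r}') = \frac{2m}{\hbar^2}\,\delta(\mathbf{r} - \mathbf{r}') \qquad (6.58) $$

We can solve for this Green's function using the Fourier transform. First, we note that
translational invariance ensures that $G_0(k; \mathbf{r}, \mathbf{r}') = G_0(k; \mathbf{r} - \mathbf{r}')$. Then we define the
Fourier transform

$$ \tilde{G}_0(k; \mathbf{q}) = \int d^3x\; e^{-i\mathbf{q}\cdot\mathbf{x}}\, G_0(k; \mathbf{x}) \quad \Leftrightarrow \quad G_0(k; \mathbf{x}) = \int \frac{d^3q}{(2\pi)^3}\; e^{i\mathbf{q}\cdot\mathbf{x}}\, \tilde{G}_0(k; \mathbf{q}) $$

Plugging this into our formula (6.58), we have

$$ (-q^2 + k^2)\,\tilde{G}_0(k; \mathbf{q}) = \frac{2m}{\hbar^2} \quad \Rightarrow \quad \tilde{G}_0(k; \mathbf{q}) = -\frac{2m}{\hbar^2}\frac{1}{q^2 - k^2} $$

So it's simple to get the Green's function in momentum space. Now we must invert it.
We have

$$ G_0(k; \mathbf{x}) = -\frac{2m}{\hbar^2}\int \frac{d^3q}{(2\pi)^3}\; \frac{e^{i\mathbf{q}\cdot\mathbf{x}}}{q^2 - k^2} $$

Here we run into the ambiguity that we mentioned above. When we do the integral
over $\mathbf{q}$, we run into a singularity whenever $q^2 = k^2$. To define the integral, when we
integrate over $q = |\mathbf{q}|$, we should define a contour in the complex $q$ plane which skips
around the pole. We do this through the so-called "$i\epsilon$ prescription" which, as the name
suggests, replaces the integral with

$$ G_0^+(k; \mathbf{x}) = -\frac{2m}{\hbar^2}\int \frac{d^3q}{(2\pi)^3}\; \frac{e^{i\mathbf{q}\cdot\mathbf{x}}}{q^2 - k^2 - i\epsilon} $$

where we subsequently take $\epsilon \to 0^+$. This shifts the pole slightly off the real $q$ axis.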
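
The simplest way to do this integral is to go to polar coordinates for the $\mathbf{q}$ variable.
We have

$$ G_0^+(k; \mathbf{x}) = -\frac{2m}{\hbar^2}\frac{1}{(2\pi)^3}\int_0^{2\pi} d\phi \int_{-1}^{+1} d(\cos\theta) \int_0^\infty dq\; \frac{q^2\, e^{iqx\cos\theta}}{q^2 - k^2 - i\epsilon} $$

$$ = -\frac{2m}{\hbar^2}\frac{1}{(2\pi)^2}\int_0^\infty dq\; \frac{q}{ix}\,\frac{e^{iqx} - e^{-iqx}}{q^2 - k^2 - i\epsilon} $$

$$ = -\frac{2m}{\hbar^2}\frac{1}{(2\pi)^2}\frac{1}{ix}\int_{-\infty}^{\infty} dq\; \frac{q\, e^{iqx}}{(q - k - i\epsilon)(q + k + i\epsilon)} $$

where we're allowed to factorise the denominator in this way, with $k > 0$, only because
we're ultimately taking $\epsilon \to 0^+$. We can now complete the derivation by contour
integral. Since $x > 0$, we can complete the contour in the upper half-plane, picking up
the residue from the pole at $q = k + i\epsilon$. This gives our final answer,

$$ G_0^+(k; \mathbf{r} - \mathbf{r}') = -\frac{2m}{\hbar^2}\frac{1}{4\pi}\frac{e^{+ik|\mathbf{r} - \mathbf{r}'|}}{|\mathbf{r} - \mathbf{r}'|} \qquad (6.59) $$

Figure 45: [The contour in the complex $q$ plane, with the pole at $q = k + i\epsilon$ marked.]

Note that had we chosen to add $+i\epsilon$ rather than $-i\epsilon$ to the denominator, we would
find the alternative Green's function $G_0^-(k; \mathbf{x}) \sim e^{-ikx}/4\pi x$. We will justify the choice
of $G_0^+$ below.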
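
For completeness, here is the residue step written out. It is a sketch of the standard argument: for $x > 0$ the factor $e^{iqx}$ decays in the upper half-plane, so the large semicircle contributes nothing and only the pole at $q = k + i\epsilon$ is enclosed.

```latex
\int_{-\infty}^{\infty} dq\, \frac{q\,e^{iqx}}{(q-k-i\epsilon)(q+k+i\epsilon)}
  = 2\pi i \left.\frac{q\,e^{iqx}}{q+k+i\epsilon}\right|_{q=k+i\epsilon}
  \;\xrightarrow{\;\epsilon\to 0^+\;}\; 2\pi i\,\frac{k\,e^{ikx}}{2k} = i\pi\,e^{ikx}
\quad\Longrightarrow\quad
G_0^+(k;x) = -\frac{2m}{\hbar^2}\,\frac{1}{(2\pi)^2}\,\frac{1}{ix}\; i\pi\,e^{ikx}
           = -\frac{2m}{\hbar^2}\,\frac{e^{ikx}}{4\pi x}
```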


Our Lippmann-Schwinger Equation 


To finally write down the Lippmann-Schwinger equation, we need to determine the
state $|\phi\rangle$ which is annihilated by $E - H_0$. But, for us, this is simply the plane wave
solution

$$ \phi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} $$

We can now write the formal Lippmann-Schwinger equation (6.57) in more concrete
form. It becomes

$$ \psi(\mathbf{k}; \mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{2m}{\hbar^2}\int d^3r'\; \frac{e^{+ik|\mathbf{r} - \mathbf{r}'|}}{4\pi|\mathbf{r} - \mathbf{r}'|}\, V(\mathbf{r}')\,\psi(\mathbf{k}; \mathbf{r}') \qquad (6.60) $$

It is simple to check that acting on this equation with the operator $(\nabla^2 + k^2)$ indeed
brings us back to the original Schrödinger equation (6.54). The Lippmann-Schwinger
equation is an integral equation, a reformulation of the more familiar Schrödinger differential
equation. It is not a solution to the Schrödinger equation because we still have
to figure out what $\psi$ is. We'll offer a strategy for doing this in Section 6.3.1.

The equation (6.60) has a very natural interpretation. The first term is simply the
ingoing wave with momentum $\hbar\mathbf{k}$. The second term is the scattered wave. Note that
the factor $e^{ik|\mathbf{r} - \mathbf{r}'|}$ tells us that this wave is moving outwards from the point $\mathbf{r}'$. Had we
instead chosen the Green's function $G_0^-$, we would have found a wave moving inwards
from infinity of the form $e^{-ik|\mathbf{r} - \mathbf{r}'|}$. This is unphysical. This is the reason that we pick
the $-i\epsilon$ prescription rather than $+i\epsilon$.


To make contact with our earlier discussion of scattering, we look at the asymptotic
form of this outgoing wave at $r \to \infty$. For this to work, we'll assume that $V(\mathbf{r}')$ has
support only in some finite region. We can then take the limit $r \gg r'$ and expand

$$ |\mathbf{r} - \mathbf{r}'| = \sqrt{r^2 - 2\mathbf{r}\cdot\mathbf{r}' + r'^2} \approx r - \frac{\mathbf{r}\cdot\mathbf{r}'}{r} $$

With $V(\mathbf{r}')$ localised within some region, it makes sense to perform this expansion inside
the integral. In this approximation the Green's function (6.59) can be written as

$$ G_0^+(k; \mathbf{r} - \mathbf{r}') \approx -\frac{2m}{\hbar^2}\frac{1}{4\pi}\frac{e^{+ikr}}{r}\, e^{-ik\hat{\mathbf{r}}\cdot\mathbf{r}'} $$

and the Lippmann-Schwinger equation then becomes

$$ \psi(\mathbf{k}; \mathbf{r}) \sim e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{2m}{\hbar^2}\frac{1}{4\pi}\left[\int d^3r'\; e^{-ik\hat{\mathbf{r}}\cdot\mathbf{r}'}\, V(\mathbf{r}')\,\psi(\mathbf{k}; \mathbf{r}')\right]\frac{e^{ikr}}{r} $$

Although we derived this by assuming that $V(r)$ has compact support, we can actually
be a little more relaxed about this. The same result holds if we require that $V(r') \to 0$
suitably quickly as $r' \to \infty$. Any potential which falls off exponentially, or as a power-law
$V(r) \sim 1/r^n$ with $n > 2$, can be treated in this way. Note, however, that this
excludes the Coulomb potential. We will deal with this separately in Section 6.4.


If we set the ingoing wave to be along the $z$-axis, $\mathbf{k} = k\hat{\mathbf{z}}$, then this takes the
asymptotic form (6.25) that we discussed previously

$$ \psi(\mathbf{r}) \sim e^{ikz} + f(\theta, \phi)\,\frac{e^{ikr}}{r} \qquad (6.61) $$

The upshot of this analysis is that we identify the scattering amplitude as

$$ f(\theta, \phi) = -\frac{2m}{\hbar^2}\frac{1}{4\pi}\int d^3r'\; e^{-ik\hat{\mathbf{r}}\cdot\mathbf{r}'}\, V(\mathbf{r}')\,\psi(\mathbf{k}; \mathbf{r}') $$

where $\theta$ and $\phi$ are the usual polar angles such that $\hat{\mathbf{r}} = (\sin\theta\cos\phi, \sin\theta\sin\phi, \cos\theta)$.
This gives a simple way to compute the scattering amplitude, but only if we already
know the form of the wavefunction $\psi(\mathbf{r}')$ in the scattering region where $V(\mathbf{r}') \neq 0$. Our
next task is to figure out how to compute $\psi(\mathbf{r}')$.

An Equation for Bound States

Above we've focussed on scattering states with energy $E = \hbar^2k^2/2m > 0$. However,
it is not difficult to repeat everything for bound states with energy $E = -\hbar^2\lambda^2/2m$.
Indeed, in this case there is no ambiguity in the definition of the Green's function. We
find that bound states must obey the integral equation

$$ \psi(\mathbf{r}) = -\frac{2m}{\hbar^2}\int d^3r'\; \frac{e^{-\lambda|\mathbf{r} - \mathbf{r}'|}}{4\pi|\mathbf{r} - \mathbf{r}'|}\, V(\mathbf{r}')\,\psi(\mathbf{r}') $$

We won't attempt to solve this equation; instead our interest will focus on the
Lippmann-Schwinger equation for scattering states (6.60).


6.3.1 The Born Approximation 

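In this section we describe a perturbative solution to the Lippmann-Schwinger equation,

$$ \psi(\mathbf{k}; \mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} + \int d^3r'\; G_0^+(k; \mathbf{r} - \mathbf{r}')\, V(\mathbf{r}')\,\psi(\mathbf{k}; \mathbf{r}') \qquad (6.62) $$

This solution is known as the Born series.

We write $\psi$ as a series expansion

$$ \psi(\mathbf{r}) = \sum_{n=0}^{\infty} \phi_n(\mathbf{r}) \qquad (6.63) $$

where we take the leading term to be the plane wave

$$ \phi_0(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} $$

This series solves (6.62) if the $\phi_n$ obey the recursion relation

$$ \phi_{n+1}(\mathbf{r}) = \int d^3r'\; G_0^+(k; \mathbf{r} - \mathbf{r}')\, V(\mathbf{r}')\,\phi_n(\mathbf{r}') $$

We will not be very precise here about the convergence properties of this series. Roughly
speaking, things will work nicely if the potential $V$ is small, so each successive term is
smaller than those preceding it.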

The Born approximation consists of taking just the leading order term $\phi_1$ in this
expansion. (Strictly speaking this is the first Born approximation; the $n^{\rm th}$ Born approximation
consists of truncating the series at the $n^{\rm th}$ term.) Using the asymptotic form of the Green's function, this is

$$ \psi(\mathbf{r}) = e^{i\mathbf{k}\cdot\mathbf{r}} - \frac{2m}{\hbar^2}\frac{1}{4\pi}\left[\int d^3r'\; e^{i\mathbf{q}\cdot\mathbf{r}'}\, V(\mathbf{r}')\right]\frac{e^{ikr}}{r} \qquad (6.64) $$

where

$$ \mathbf{q} = \mathbf{k} - k\hat{\mathbf{r}} $$

can be thought of as the momentum transferred from the incoming wave to the outgoing
wave. With this in mind, it's traditional to define the momentum of the outgoing wave
to be

$$ \mathbf{k}' = k\hat{\mathbf{r}} $$

so that $\mathbf{q} = \mathbf{k} - \mathbf{k}'$. Comparing the Born approximation (6.64) to the asymptotic form
(6.61), we see that the scattering amplitude is simply the Fourier transform of the
potential,

$$ f(\theta, \phi) \approx f_0(\theta, \phi) = -\frac{2m}{\hbar^2}\frac{1}{4\pi}\left[\int d^3r'\; e^{i\mathbf{q}\cdot\mathbf{r}'}\, V(\mathbf{r}')\right] \equiv -\frac{m}{2\pi\hbar^2}\,\tilde{V}(\mathbf{q}) $$

Note that the scattering amplitude is a function of $\theta$ and $\phi$,
but these variables are somewhat hidden in the notation of the
right-hand side. They're sitting in the definition of $\mathbf{q}$, with
$\mathbf{k}\cdot\mathbf{k}' = k^2\cos\theta$, and the variable $\phi$ determining the relative
orientation as shown in the figure. As we've seen before, for a
central potential $V(\mathbf{r}) = V(r)$, the resulting scattering amplitude
will be independent of $\phi$. Because the angular variables
are somewhat disguised, the scattering amplitude is sometimes
written as $f(\mathbf{k}, \mathbf{k}')$ instead of $f(\theta, \phi)$. Indeed, we'll adopt this notation in Section 6.5.

Figure 46: [The incoming momentum $\mathbf{k}$ and the outgoing momentum $\mathbf{k}'$.]

Finally, the cross-section in the Born approximation is simply

$$ \frac{d\sigma}{d\Omega} \approx |f_0|^2 = \left(\frac{m}{2\pi\hbar^2}\right)^2 |\tilde{V}(\mathbf{q})|^2 \qquad (6.65) $$

There's some physics in this simple formula. Suppose that your potential has some
short-distance structure on scales $\sim L$. Then the Fourier transform $\tilde{V}(\mathbf{q})$ is only sensitive
to this when the momentum transfer is of order $q \sim 1/L$. This is a manifestation
of the uncertainty principle: if you want to probe short distance physics, you need high
momentum transfer.


6.3.2 The Yukawa Potential and the Coulomb Potential 


At long distances, the strong nuclear force between, say, a proton and a neutron is well
modelled by the Yukawa potential

$$ V(r) = \frac{A e^{-\mu r}}{r} $$

where $1/\mu$ is said to be the range of the force.

Figure 47: The cross-section for the Yukawa potential... Figure 48: ...and for the Coulomb potential.

We can compute the Fourier transform
using the same kind of contour methods that we used in the previous section. We have

$$ \tilde{V}(q) = \frac{4\pi A}{q^2 + \mu^2} $$
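
This Fourier transform can be checked without contour methods. The sketch below uses direct numerical quadrature instead: for a central potential the angular integrals give $\tilde{V}(q) = (4\pi/q)\int_0^\infty dr\, r\sin(qr)\,V(r)$, which for the Yukawa potential reduces to $(4\pi A/q)\int_0^\infty dr\, \sin(qr)\,e^{-\mu r}$. The cutoff radius and the number of Simpson steps are illustrative choices.

```python
import math

def yukawa_ft_numeric(q, A=1.0, mu=1.0, n_steps=20000):
    """(4 pi A / q) * integral_0^R sin(q r) exp(-mu r) dr, by composite Simpson's rule."""
    r_max = 40.0 / mu          # the integrand is ~ e^{-40} beyond this radius
    h = r_max / n_steps        # n_steps must be even for Simpson's rule

    def integrand(r):
        return math.sin(q * r) * math.exp(-mu * r)

    total = integrand(0.0) + integrand(r_max)
    for i in range(1, n_steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return 4.0 * math.pi * A / q * total * h / 3.0

def yukawa_ft_exact(q, A=1.0, mu=1.0):
    """Closed form 4 pi A / (q^2 + mu^2)."""
    return 4.0 * math.pi * A / (q * q + mu * mu)

for q in (0.5, 1.5, 3.0):
    print(q, yukawa_ft_numeric(q), yukawa_ft_exact(q))
```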


Writing this in terms of the scattering angle $\theta$, we recall that $\mathbf{q} = \mathbf{k} - \mathbf{k}'$ with $\mathbf{k}' = k\hat{\mathbf{r}}$,
so that

$$ q^2 = 2k^2 - 2\mathbf{k}\cdot\mathbf{k}' = 2k^2(1 - \cos\theta) = 4k^2\sin^2(\theta/2) $$

If we translate from momentum $k$ to energy $E = \hbar^2k^2/2m$, then from (6.65), we have
the leading order contribution to the cross-section for the Yukawa potential given by

$$ \frac{d\sigma}{d\Omega} \approx \left(\frac{2Am}{\hbar^2\mu^2 + 8mE\sin^2(\theta/2)}\right)^2 \qquad (6.66) $$

This is shown in the left-hand figure (for values $A = m = \hbar\mu = 1$ and $E = 1/4$).


An Attempt at Rutherford Scattering 


It's tempting to look at what happens when $\mu \to 0$, so that the Yukawa force becomes
the Coulomb force. For example, for electron-electron or proton-proton scattering, the
strength of the Coulomb force is $A = e^2/4\pi\epsilon_0$. In this case, the cross-section (6.66)
becomes,

$$ \frac{d\sigma}{d\Omega} \approx \left(\frac{A}{4E}\right)^2\frac{1}{\sin^4(\theta/2)} \qquad (6.67) $$

This is shown in the right-hand figure (with the same values). Note that there is an
enhancement of the cross-section at all scattering angles, but a divergence at forward
scattering.
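
The two cross-sections are simple enough to compare directly. The sketch below evaluates (6.66) and (6.67) in the units used for the figures ($A = m = \hbar\mu = 1$, $E = 1/4$). The function names are illustrative.

```python
import math

def yukawa_born(theta, E, A=1.0, m=1.0, hbar=1.0, mu=1.0):
    """Born cross-section (6.66) for the Yukawa potential."""
    denom = (hbar * mu) ** 2 + 8.0 * m * E * math.sin(theta / 2.0) ** 2
    return (2.0 * A * m / denom) ** 2

def rutherford(theta, E, A=1.0):
    """Coulomb limit (6.67): (A/4E)^2 / sin^4(theta/2)."""
    return (A / (4.0 * E)) ** 2 / math.sin(theta / 2.0) ** 4

E = 0.25
for theta in (0.2, 1.0, 2.0, 3.0):
    print(theta, yukawa_born(theta, E), rutherford(theta, E))
```

At every angle the Coulomb value is the larger of the two. As $\theta \to 0$ the Yukawa result stays finite, approaching $(2Am/\hbar^2\mu^2)^2$, while the Coulomb result diverges.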

Rather remarkably, the quantum result (6.67) agrees with the classical cross-section
that we found in (6.22)! This is a surprise and is special to the Coulomb potential.
Rutherford was certainly a great scientist but, like many other great scientists before
him, he had his fair share of luck.

In fact, Rutherford's luck ran deeper than you might think. It turns out that the
Born approximation is valid for the Yukawa potential in certain regimes, but is never
valid for the Coulomb potential! The difficulty stems from the long range nature of
the Coulomb force which means that the plane wave solutions $\phi_0 \sim e^{i\mathbf{k}\cdot\mathbf{r}}$ are never
really good approximations to the asymptotic states. We will describe the correct
treatment of the Coulomb potential in Section 6.4 where we will see that, although our
approximation wasn't valid, the result (6.67) is correct after all.


6.3.3 The Born Expansion 


One can continue the Born expansion to higher orders. In compressed notation, the
solution (6.63) takes the form

$$ \psi = \phi_0 + \int G_0^+ V\phi_0 + \int\!\!\int G_0^+ V G_0^+ V\phi_0 + \int\!\!\int\!\!\int G_0^+ V G_0^+ V G_0^+ V\phi_0 + \ldots $$

This has a natural interpretation. The first term describes the incident plane wave
which doesn't scatter at all. The second term describes the wave scattering once off
the potential, before propagating by $G_0^+$ to the asymptotic regime. The third term
describes the wave scattering off the potential, propagating some distance by $G_0^+$ and
then scattering for a second time before leaving the region with the potential. In
general, the term with $n$ copies of $V$ should be thought of as the wave scattering $n$
times from the potential region.


There's a useful diagrammatic way to write the resulting scattering amplitude. It is
given by

[Diagrams not reproduced: the scattering amplitude written as a sum of diagrams.]

while each line describes an insertion of

$$ -\frac{1}{q^2 - k^2 - i\epsilon} $$

Meanwhile, for each internal line we include the integral

$$ \frac{1}{4\pi}\int \frac{d^3q}{(2\pi)^3} $$

Although we’re dealing with wave scattering, it’s tempting to think of the lines as 
describing the trajectory of a particle. Indeed, this diagrammatic picture is a precursor 
to Feynman diagrams that occur in quantum field theory, where there’s a much closer 
connection to the underlying particles. 


6.4 Rutherford Scattering 


“How can a fellow sit down at a table and calculate something that would 
take me — me — six months to measure in a laboratory?” 


Ernest Rutherford 


Historically, some of the most important scattering problems in particle physics 
involved the Coulomb potential. This is the problem of Rutherford scattering. Yet, 
as we mentioned above, none of the techniques that we’ve mentioned so far are valid 
for the Coulomb potential. This is mitigated somewhat by the fact that we get the 
right answer whether we work classically (6.22) or using the Born approximation (6.67). 
Nonetheless, this is a little unsatisfactory. After all, how do we know that this is the 
right answer! 


Here we show how to do Rutherford scattering properly. We want to solve the
Schrödinger equation

$$ \left(-\frac{\hbar^2}{2m}\nabla^2 + \frac{A}{r}\right)\psi(\mathbf{r}) = E\psi(\mathbf{r}) $$

where $A > 0$ for repulsive interactions and $A < 0$ for attractive interactions. It will
prove useful to rewrite this as

$$ \left(\nabla^2 + k^2 - \frac{2\gamma k}{r}\right)\psi(\mathbf{r}) = 0 \qquad (6.68) $$

where, as usual, $E = \hbar^2k^2/2m$ while $\gamma = mA/\hbar^2k$ is a dimensionless parameter which
characterises the strength of the Coulomb force.

The Asymptotic Form of the Wavefunction

Let's start by understanding what the wavefunctions look like asymptotically. Repeating
the analysis of Section 6.2.3, the radial wavefunction $R_l(r)$ satisfies

$$ \left(\frac{d^2}{dr^2} + \frac{2}{r}\frac{d}{dr} + k^2 - \frac{l(l+1)}{r^2} - \frac{2\gamma k}{r}\right) R_l(r) = 0 $$

Already here we can see what the issue is. At large distances, $r \to \infty$, the Coulomb force
is more important than the angular momentum barrier. We saw in previous sections
that when $\gamma = 0$, the asymptotic form of the wavefunction is given by $R_l(r) = e^{\pm ikr}/r$
regardless of the value of $l$. However, when $\gamma \neq 0$ we have to revisit this conclusion.

With the previous solution in mind, we will look for solutions which asymptotically
take the form

$$ R_l(r) \sim \frac{e^{\pm ikr + g(r)}}{r} $$

for some function $g(r)$. Inserting this ansatz, we find that $g(r)$ must satisfy

$$ \frac{d^2g}{dr^2} + \left(\frac{dg}{dr}\right)^2 \pm 2ik\frac{dg}{dr} = \frac{2\gamma k}{r} $$

But, for now, we care only about the asymptotic expression where the left-hand side is
dominated by the last term. We then have

$$ \pm i\frac{dg}{dr} = \frac{\gamma}{r} \quad \text{as } r \to \infty $$

which is solved, up to some constant, by $g = \mp i\gamma\log(kr)$. Clearly this diverges as
$r \to \infty$ and so should be included in the asymptotic form. We learn that asymptotically
the radial wavefunctions take the form

$$ R_l(r) \sim \frac{e^{\pm i(kr - \gamma\log(kr))}}{r} $$

This extra logarithm in the phase of the wavefunction means that the whole framework
we described previously needs adjusting.


Note that this same analysis tells us that our previous formalism for scattering works
fine for any potential $V(r) \sim 1/r^n$ with $n > 2$. It is just the long-range Coulomb
potential that needs special treatment.
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6.4.1 The Scattering Amplitude 


To compute the amplitude for Rutherford scattering, we don’t need any new conceptual 
ideas. But we do need to invoke some technical results about special functions. This 
is because the solution to the Schrödinger equation (6.68) can be written as 


wr) = gerry + iy) 1 Fy ( — iy; 1; i(kr —k- r)) 


where F; (a; b; w) is the confluent hypergeometric function, defined by the series ex- 
pansion 


a a(a+1)w? | a(at+1)(a+ 2) w’ 


F,(a;b;w) =1+4 tw 4 | 
LF, (a; b; w) o  B(b+1) 2! B+ 1)\(b+2) 3! 


We won’t prove that this wavefunction is a solution to the Schrödinger equation. Moreover, the only
fact we'll need about the hypergeometric function is its expansion for large $|w|$. For
our solution, this is an expansion in $1/(kr - \mathbf k\cdot\mathbf r)$ and so is valid at large distance, but
not along the direction of the incident beam $\mathbf k$. If we take $\mathbf k = k\hat{\mathbf z}$, we have


$$\psi(\mathbf r) \sim e^{ikz + i\gamma\log(k(r-z))} - \frac{\gamma}{k(r-z)}\frac{\Gamma(1+i\gamma)}{\Gamma(1-i\gamma)}\, e^{ikr - i\gamma\log(k(r-z))} + \ldots$$

where the $+\ldots$ are corrections to both terms which are suppressed by $1/k(r-z)$. This
is now very similar to our usual asymptotic form (6.61), but with the corrected phases.
The first term describes the ingoing wave, the second term the scattered outgoing wave.
We can therefore write


$$\psi(\mathbf r) \sim e^{ikz + i\gamma\log(k(r-z))} + f(\theta)\,\frac{e^{ikr - i\gamma\log(k(r-z))}}{r}$$


where the scattering amplitude is given by 


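$$f(\theta) = -\frac{\gamma}{k}\frac{\Gamma(1+i\gamma)}{\Gamma(1-i\gamma)}\frac{r}{r-z} = -\frac{\gamma}{2k}\frac{\Gamma(1+i\gamma)}{\Gamma(1-i\gamma)}\frac{1}{\sin^2(\theta/2)} \qquad (6.69)$$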


We learn that the cross-section is 


$$\frac{d\sigma}{d\Omega} = |f(\theta)|^2 = \left(\frac{mA}{2\hbar^2k^2}\right)^2\frac{1}{\sin^4(\theta/2)}$$


This is the same result as we saw using the invalid Born approximation (6.67) and the
same result that we saw from a classical analysis (6.22). This shouldn’t give you the
wrong idea. In most situations if you use the wrong method you will get the wrong
answer! The Coulomb potential is an exception.
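
The step from the amplitude to the cross-section uses the fact that, for real $\gamma$, $\Gamma(1+i\gamma)$ and $\Gamma(1-i\gamma)$ are complex conjugates, so their ratio is a pure phase and drops out of $|f|^2$. A short sketch of the resulting formula follows. The function names are hypothetical, and the code works with the dimensionless $\gamma = mA/\hbar^2k$ and the wavenumber $k$, so that $\gamma/2k = mA/2\hbar^2k^2$.

```python
import math

def coulomb_amplitude_modulus(theta, k, gamma):
    """|f(theta)| from (6.69); the ratio of gamma functions has unit modulus."""
    return abs(gamma) / (2 * k * math.sin(theta / 2) ** 2)

def rutherford_cross_section(theta, k, gamma):
    """d(sigma)/d(Omega) = |f|^2 = (gamma / 2k)^2 / sin^4(theta / 2)."""
    return coulomb_amplitude_modulus(theta, k, gamma) ** 2

if __name__ == "__main__":
    k, gamma = 1.0, 0.5
    for deg in (30, 90, 180):
        print(deg, rutherford_cross_section(math.radians(deg), k, gamma))
```

The result depends on $\gamma$ only through $\gamma^2$, so repulsive and attractive Coulomb potentials give the same cross-section, and the $1/\sin^4(\theta/2)$ divergence in the forward direction is visible as $\theta$ decreases.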

Recovering the Hydrogen Atom


There’s a rather nice exercise we can do with the scattering amplitude (6.69). When 
$\gamma < 0$, the Coulomb potential is attractive and has bound states. Moreover, these
bound states are simply those of the hydrogen atom that we met in our first course on 
quantum mechanics. From our earlier analysis, we should be able to recover this from 
the poles in the scattering amplitude. 


These arise from the gamma function $\Gamma(z)$ which has no zeros, but has poles at
$z = 0, -1, -2, \ldots$. The scattering amplitude therefore has poles when

$$1 + i\gamma = -(n-1) \quad\Rightarrow\quad k = -i\,\frac{mA}{\hbar^2}\frac{1}{n} \qquad \text{with } n = 1, 2, 3, \ldots$$

For an attractive potential with $A < 0$, these poles lie along the positive imaginary
$k$-axis, as they should. We see that they correspond to bound states with energy


$$E = \frac{\hbar^2k^2}{2m} = -\frac{mA^2}{2\hbar^2}\frac{1}{n^2}$$


This, of course, is the familiar spectrum of the hydrogen atom. 
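
As a numerical illustration of this result, the sketch below evaluates the pole positions and the corresponding energies. The function names are hypothetical, the constants are CODATA values in SI units, and two assumptions are made that the text does not spell out: the Coulomb strength is taken to be $A = -e^2/4\pi\epsilon_0$, and the electron mass is used in place of the reduced mass.

```python
import math

HBAR = 1.054571817e-34       # J s
M_E = 9.1093837015e-31       # kg (electron mass; reduced-mass correction ignored)
E_CHARGE = 1.602176634e-19   # C
EPS0 = 8.8541878128e-12      # F / m

def pole_wavenumber(n, m, A, hbar):
    """Solve 1 + i gamma = -(n - 1) with gamma = m A / (hbar^2 k)."""
    return -1j * m * A / (hbar**2 * n)

def bound_state_energy(n, m, A, hbar):
    """E = hbar^2 k^2 / 2m evaluated at the n-th pole (real and negative)."""
    k = pole_wavenumber(n, m, A, hbar)
    return (hbar**2 * k**2 / (2 * m)).real

if __name__ == "__main__":
    A = -E_CHARGE**2 / (4 * math.pi * EPS0)   # assumed attractive Coulomb strength
    for n in (1, 2, 3):
        print(n, bound_state_energy(n, M_E, A, HBAR) / E_CHARGE, "eV")
```

For $A < 0$ the returned wavenumbers are positive imaginary, and the $n = 1$ energy should come out close to the familiar $-13.6$ eV.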


6.5 Scattering Off a Lattice 


Finally, we come to an important question: how do we know that solids are made of
lattices? The answer, of course, is scattering. Firing a beam of particles (whether
neutrons, electrons or photons in the X-ray spectrum) at the solid reveals a characteristic
diffraction pattern. Our goal here is to understand this within the general
context of scattering theory.


Our starting point is the standard asymptotic expression describing a wave scattering 
off a central potential, localised around the origin, 


$$\psi(\mathbf r) \sim e^{i\mathbf k\cdot\mathbf r} + f(\mathbf k,\mathbf k')\,\frac{e^{ikr}}{r} \qquad (6.70)$$

Here we’re using the notation, introduced in earlier sections, of the scattered momentum

$$\mathbf k' = k\hat{\mathbf r}$$

The idea here is that if you sit far away in the direction $\hat{\mathbf r}$, you will effectively see a wave
with momentum $\mathbf k'$. We therefore write $f(\mathbf k,\mathbf k')$ to mean the same thing as $f(k;\theta,\phi)$.

Suppose now that the wave scatters off a potential which is localised at some other
position, $\mathbf r = \mathbf R$. Then the equation (6.70) becomes

$$\psi(\mathbf r) \sim e^{i\mathbf k\cdot(\mathbf r-\mathbf R)} + f(\mathbf k,\mathbf k')\,\frac{e^{ik|\mathbf r-\mathbf R|}}{|\mathbf r-\mathbf R|}$$


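For $r \to \infty$, we can expand

$$|\mathbf r-\mathbf R| = \sqrt{r^2 + R^2 - 2\mathbf r\cdot\mathbf R} \approx r\sqrt{1 - 2\mathbf r\cdot\mathbf R/r^2} \approx r - \hat{\mathbf r}\cdot\mathbf R$$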


We then have 


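$$\psi(\mathbf r) \sim e^{-i\mathbf k\cdot\mathbf R}\left[e^{i\mathbf k\cdot\mathbf r} + f(\mathbf k,\mathbf k')\,e^{-i(\mathbf k'-\mathbf k)\cdot\mathbf R}\,\frac{e^{ikr}}{r}\right] \qquad (6.71)$$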


The overall factor is unimportant, since our interest lies in the phase shift between
the incident wave and the scattered wave. We see that we get an effective scattering
amplitude

$$f_{\mathbf R}(\mathbf k;\hat{\mathbf r}) = f(\mathbf k,\mathbf k')\,e^{i\mathbf q\cdot\mathbf R}$$

where we have defined the transferred momentum

$$\mathbf q = \mathbf k - \mathbf k'$$
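
The far-field expansion behind this phase factor is easy to test numerically. The sketch below compares the exact factor multiplying $f\,e^{ikr}/r$ for a scatterer displaced to $\mathbf R$, after the overall phase $e^{-i\mathbf k\cdot\mathbf R}$ is removed, with the claimed $e^{i\mathbf q\cdot\mathbf R}$. All names and sample vectors are illustrative assumptions.

```python
import cmath
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def displaced_factor(k_vec, R_vec, r_hat, r):
    """Exact factor multiplying f e^{ikr}/r for a scatterer at R,
    once the overall phase e^{-i k.R} has been pulled out."""
    k = math.sqrt(dot(k_vec, k_vec))
    sep = math.sqrt(sum((r * n - R) ** 2 for n, R in zip(r_hat, R_vec)))  # |r - R|
    return cmath.exp(1j * (dot(k_vec, R_vec) + k * (sep - r))) * r / sep

def laue_phase(k_vec, R_vec, r_hat):
    """The far-field prediction e^{i q.R}, with q = k - k' and k' = k r_hat."""
    k = math.sqrt(dot(k_vec, k_vec))
    q = [kc - k * n for kc, n in zip(k_vec, r_hat)]
    return cmath.exp(1j * dot(q, R_vec))

if __name__ == "__main__":
    k_vec, R_vec, r_hat = (0.0, 0.0, 2.0), (0.3, -0.5, 0.7), (0.6, 0.0, 0.8)
    for r in (1e2, 1e4, 1e6):
        print(r, abs(displaced_factor(k_vec, R_vec, r_hat, r) - laue_phase(k_vec, R_vec, r_hat)))
```

The discrepancy should shrink roughly like $1/r$, reflecting the terms dropped in the expansion of $|\mathbf r - \mathbf R|$.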


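Now let’s turn to a lattice of points $\Lambda$. Ignoring multiple scatterings, the amplitude is
simply the sum of the amplitudes from each lattice point

$$f_\Lambda(\mathbf k,\mathbf k') = f(\mathbf k,\mathbf k')\sum_{\mathbf R\in\Lambda} e^{i\mathbf q\cdot\mathbf R} \qquad (6.72)$$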


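The sum $\Delta(\mathbf q) = \sum_{\mathbf R\in\Lambda} e^{i\mathbf q\cdot\mathbf R}$ has the nice property that it vanishes unless $\mathbf q$ lies in the
reciprocal lattice $\Lambda^*$. This is simple to see: since we have an infinite lattice it must be
true that, for any vector $\mathbf R_0 \in \Lambda$,

$$\Delta(\mathbf q) = \sum_{\mathbf R\in\Lambda} e^{i\mathbf q\cdot\mathbf R} = \sum_{\mathbf R\in\Lambda} e^{i\mathbf q\cdot(\mathbf R-\mathbf R_0)} = e^{-i\mathbf q\cdot\mathbf R_0}\,\Delta(\mathbf q)$$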


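This means that either $e^{-i\mathbf q\cdot\mathbf R_0} = 1$ or $\Delta(\mathbf q) = 0$. The former result is equivalent to the
statement that $\mathbf q \in \Lambda^*$. More generally,

$$\sum_{\mathbf R\in\Lambda} e^{i\mathbf q\cdot\mathbf R} \equiv \Delta(\mathbf q) = V^*\sum_{\mathbf Q\in\Lambda^*}\delta(\mathbf q-\mathbf Q) \qquad (6.73)$$

where $V^*$ is the volume of the unit cell of $\Lambda^*$. We see that $\Delta(\mathbf q)$ is very strongly
(formally, infinitely) peaked on the reciprocal lattice. (We met this same sum
when discussing lattices in Lectures on Solid State Physics.)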
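
The peaking of this sum can already be seen for a finite one-dimensional lattice. The following sketch is an illustration only: the function name, the lattice spacing $a = 1$ and the number of sites $N$ are arbitrary choices. It evaluates the sum over $N$ sites and compares a reciprocal lattice point, $q = 2\pi/a$, with two values of $q$ away from the reciprocal lattice.

```python
import cmath
import math

def lattice_sum(q, a=1.0, N=200):
    """Finite-lattice version of the sum: sum_{n=0}^{N-1} exp(i q n a)."""
    return sum(cmath.exp(1j * q * n * a) for n in range(N))

if __name__ == "__main__":
    for q in (2 * math.pi, math.pi, 1.0):
        # magnitude is N on the reciprocal lattice and of order one elsewhere
        print(q, abs(lattice_sum(q)))
```

On the reciprocal lattice every term equals one and the sum grows like $N$; elsewhere it stays bounded by $1/|\sin(qa/2)|$, which is the finite-size shadow of the delta functions in (6.73).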

The upshot of this discussion is a lovely result: there is scattering from a lattice if
and only if

$$\mathbf k - \mathbf k' \in \Lambda^* \qquad (6.74)$$

This is known as the Laue condition. If the scattered momentum does not satisfy
this condition, then the interference between all the different scattering sites results
in a vanishing wave. Only when the Laue condition is obeyed is this interference
constructive.


Figure 49: The Ewald sphere. Figure 50: Salt. 


Alternatively, the Laue condition can be viewed as momentum conservation, with
the intuition that the lattice can only absorb momentum in $\Lambda^*$.


Solutions to the Laue condition are not generic. If you take a lattice with a fixed 
orientation and fire a beam with fixed k, chances are that there are no solutions to 
(6.74). To see this, consider the reciprocal lattice as shown in the left-hand panel of 
the figure. From the tip of k draw a sphere of radius k. This is sometimes known as 
the Ewald sphere and its surface gives the possible transferred momenta $\mathbf q = \mathbf k - \mathbf k'$.
There is scattering only if this surface passes through a point on the reciprocal lattice. 
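
To see how rare solutions are, one can simply search for them. The sketch below is a brute-force illustration, assuming a simple cubic lattice of spacing $a$ so that $\Lambda^*$ consists of the points $\mathbf Q = (2\pi/a)(n_1,n_2,n_3)$. It looks for nonzero $\mathbf Q$ with $|\mathbf k - \mathbf Q| = |\mathbf k|$, which is the Laue condition combined with elastic scattering. The function name, search range and tolerance are hypothetical.

```python
import itertools
import math

def laue_solutions(k_vec, a=1.0, n_max=3, tol=1e-9):
    """Integer triples n with Q = (2 pi / a) n on the Ewald sphere, |k - Q| = |k|."""
    k2 = sum(x * x for x in k_vec)
    solutions = []
    for n in itertools.product(range(-n_max, n_max + 1), repeat=3):
        if n == (0, 0, 0):
            continue  # Q = 0 is the unscattered beam
        Q = [2 * math.pi * ni / a for ni in n]
        k_out2 = sum((x - y) ** 2 for x, y in zip(k_vec, Q))
        if abs(k_out2 - k2) < tol * max(k2, 1.0):
            solutions.append(n)
    return solutions

if __name__ == "__main__":
    print(laue_solutions((1.234, 0.567, 0.891)))   # a generic beam
    print(laue_solutions((math.pi, 0.0, 0.0)))     # a beam tuned to the zone boundary
```

A generic incoming momentum gives an empty list, while a beam tuned so that $\mathbf k$ ends on the plane bisecting a reciprocal lattice vector picks out that vector.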


To get scattering, we must therefore either find a way to vary the incoming momentum
$\mathbf k$, or find a way to vary the orientation of the lattice. But when this is achieved,
the outgoing photons $\mathbf k' = k\hat{\mathbf r}$ sit only at very specific positions. In this way, we get to
literally take a photograph of the reciprocal lattice! The resulting diffraction pattern
for salt (NaCl) which has a cubic lattice structure is shown in the right-hand panel.
The four-fold symmetry of the reciprocal lattice is clearly visible.


6.5.1 The Bragg Condition 


There is an equivalent phrasing of the Laue condition in real space. Suppose that the 
momentum vectors obey 


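$$\mathbf k - \mathbf k' = \mathbf Q \in \Lambda^*$$

Since $\mathbf Q$ is a reciprocal lattice vector, so too is $n\mathbf Q$ for all $n \in \mathbf Z$. Suppose that $\mathbf Q$ is minimal, so
that $n\mathbf Q$ is not a reciprocal lattice vector for any $0 < n < 1$. Defining the angle $\theta$ by $\mathbf k\cdot\mathbf k' = k^2\cos\theta$,
we can take the square of the equation above to get

$$2k^2(1-\cos\theta) = 4k^2\sin^2(\theta/2) = Q^2 \quad\Rightarrow\quad 2k\sin(\theta/2) = Q$$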


Figure 51: A quasi-crystal. Figure 52: DNA, Photograph 51. 


We can massage this further. The vector $\mathbf Q \in \Lambda^*$ defines a set of parallel planes in $\Lambda$.
Known as Bragg planes, these are labelled by an integer $n$ and defined by those $\mathbf a \in \Lambda$
which obey $\mathbf a\cdot\mathbf Q = 2\pi n$. The distance between successive planes is

$$d = \frac{2\pi}{Q}$$

Furthermore, the wavevector $k$ corresponds to a wavelength $\lambda = 2\pi/k$. We learn that
the Laue condition can be written as the requirement that

$$\lambda = 2d\sin(\theta/2)$$

Repeating this argument for vectors $n\mathbf Q$ with $n \in \mathbf Z$, we get

$$n\lambda = 2d\sin(\theta/2)$$

Figure 53:


This is the Bragg condition. It has a simple interpretation. For $n = 1$, we assume
that the wave scatters off two consecutive planes of the lattice, as shown in the figure. The
wave which hits the lower plane travels an extra distance of $2x = 2d\sin(\theta/2)$. The
Bragg condition requires this extra distance to coincide with the wavelength of light.
In other words, it is the statement that waves reflecting off consecutive planes interfere
constructively.
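
The Bragg condition is simple enough to turn into a small calculator. The sketch below lists, for a given wavelength and plane spacing, the allowed orders $n$ and the corresponding scattering angles $\theta$ (the angle between $\mathbf k$ and $\mathbf k'$, so that $\theta/2$ is the glancing angle to the planes). The function name and the sample numbers are illustrative assumptions.

```python
import math

def bragg_angles(wavelength, d):
    """Pairs (n, theta) solving n * wavelength = 2 d sin(theta / 2), theta in radians."""
    angles = []
    n = 1
    while n * wavelength <= 2 * d:
        angles.append((n, 2 * math.asin(n * wavelength / (2 * d))))
        n += 1
    return angles

if __name__ == "__main__":
    # illustrative values in angstroms: roughly an x-ray line and an atomic plane spacing
    for n, theta in bragg_angles(wavelength=1.54, d=2.82):
        print(n, math.degrees(theta))
```

No solutions exist once $\lambda > 2d$, which is the quantitative version of the statement that the wavelength must be comparable to the atomic separation.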

The Bragg condition gives us licence to think about scattering of light off planes in
the lattice, rather than individual lattice sites. Moreover, it tells us that the wavelength
of light should be comparable to the atomic separation in the crystal. This means x-rays.
The technique of x-ray crystallography was pioneered by Max von Laue, who
won the 1914 Nobel prize. The Bragg law was developed by William Bragg, a fellow of
Trinity and director of the Cavendish. He shared the 1915 Nobel prize in physics with
his father, also William Bragg, for their development of crystallographic techniques.


X-ray crystallography remains the most important technique to determine the structure
of materials. Two examples of historical interest are shown in the figures. The
picture on the left is something of an enigma since it has five-fold symmetry. Yet
there are no Bravais lattices with this symmetry! The diffraction picture is revealing
a quasi-crystal, an ordered but non-periodic crystal. The image on the right was taken
by Rosalind Franklin and is known as “photograph 51”. It provided a major, and
somewhat controversial, hint to Crick and Watson in their discovery of the structure
of DNA.


6.5.2 The Structure Factor 


Many crystals are described by a repeating group of atoms, where each group sits on
an underlying Bravais lattice $\Lambda$. The atoms in the group are displaced from the vertex
of the Bravais lattice by a vector $\mathbf d_i$. We saw several examples of this in the Lectures
on Solid State Physics. In such a situation, the scattering amplitude (6.72) is replaced
by

$$f_{\rm lattice}(\mathbf k,\mathbf k') = \Delta(\mathbf q)\,S(\mathbf q)$$

where

$$S(\mathbf q) = \sum_i f_i(\mathbf k,\mathbf k')\,e^{i\mathbf q\cdot\mathbf d_i}$$

We have allowed for the possibility that each atom in the basis has a different scattering
amplitude $f_i(\mathbf k,\mathbf k')$. The function $S(\mathbf q)$ is called the geometric structure factor.


An Example: BCC Lattice 


As an example, consider the BCC lattice viewed as a simple cubic lattice of size $a$,
with two basis vectors sitting at $\mathbf d_1 = 0$ and $\mathbf d_2 = \frac{a}{2}(1,1,1)$. If we take the atoms on
the points $\mathbf d_1$ and $\mathbf d_2$ to be identical, then the associated scattering amplitudes are also
equal: $f_1 = f_2 = f$.

Figure 54: A BCC lattice as cubic lattice + basis. Figure 55: The reciprocal as a cubic lattice minus a basis.


We know that the scattering amplitude is non-vanishing only if the transferred momentum
$\mathbf q$ lies on the reciprocal lattice, meaning

$$\mathbf q = \frac{2\pi}{a}(n_1, n_2, n_3) \qquad n_i \in \mathbf Z$$

This then gives the structure factor

$$S(\mathbf q) = f\left(e^{i\mathbf q\cdot\mathbf d_1} + e^{i\mathbf q\cdot\mathbf d_2}\right) = f\left(1 + e^{i\pi\sum_i n_i}\right) = \begin{cases} 2f & \sum_i n_i \text{ even} \\ 0 & \sum_i n_i \text{ odd} \end{cases}$$

We see that not all points in the reciprocal lattice $\Lambda^*$ contribute. If we draw the
reciprocal, simple cubic lattice and delete the odd points, as shown in the right-hand
figure, we find ourselves left with a FCC lattice. (Admittedly, the perspective in the
figure isn’t great.) But this is exactly what we expect since it is the reciprocal of the
BCC lattice.


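Another Example: Diamond

A diamond lattice consists of two, interlaced FCC lattices with basis vectors $\mathbf d_1 = 0$
and $\mathbf d_2 = \frac{a}{4}(1,1,1)$. An FCC lattice has reciprocal lattice vectors $\mathbf b_1 = \frac{2\pi}{a}(-1,1,1)$,
$\mathbf b_2 = \frac{2\pi}{a}(1,-1,1)$ and $\mathbf b_3 = \frac{2\pi}{a}(1,1,-1)$. For $\mathbf q = \sum_i n_i\mathbf b_i$, the structure factor is

$$S(\mathbf q) = f\left(1 + e^{i(\pi/2)\sum_i n_i}\right) = f \times \begin{cases} 2 & \sum_i n_i = 0 \bmod 4 \\ 1+i & \sum_i n_i = 1 \bmod 4 \\ 0 & \sum_i n_i = 2 \bmod 4 \\ 1-i & \sum_i n_i = 3 \bmod 4 \end{cases}$$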
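
Both examples can be reproduced with a few lines of code. The sketch below evaluates $S(\mathbf q) = \sum_i f\,e^{i\mathbf q\cdot\mathbf d_i}$ for identical atoms, with $f$ set to one, and applies it to the BCC and diamond bases given above. The function names and the choice $a = 1$ are illustrative.

```python
import cmath
import math

def structure_factor(q, basis, f=1.0):
    """S(q) = sum_i f exp(i q . d_i) for identical atoms at positions d_i."""
    return sum(f * cmath.exp(1j * sum(qc * dc for qc, dc in zip(q, d))) for d in basis)

def bcc_S(n, a=1.0):
    """BCC as simple cubic plus basis; q = (2 pi / a)(n1, n2, n3)."""
    q = [2 * math.pi * ni / a for ni in n]
    return structure_factor(q, [(0, 0, 0), (a / 2, a / 2, a / 2)])

def diamond_S(n, a=1.0):
    """Diamond as FCC plus basis; q = n1 b1 + n2 b2 + n3 b3."""
    b = [(-1, 1, 1), (1, -1, 1), (1, 1, -1)]
    q = [2 * math.pi / a * sum(n[i] * b[i][c] for i in range(3)) for c in range(3)]
    return structure_factor(q, [(0, 0, 0), (a / 4, a / 4, a / 4)])

if __name__ == "__main__":
    for n in [(1, 0, 0), (1, 1, 0), (1, 1, 1), (2, 1, 1)]:
        print(n, bcc_S(n), diamond_S(n))
```

For BCC the points with odd $\sum_i n_i$ are extinguished; for diamond only those with $\sum_i n_i = 2 \bmod 4$ vanish, while the odd ones survive with reduced weight $|1 \pm i|^2 = 2$ in the intensity.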


6.5.3 The Debye-Waller Factor 


So far, we’ve treated the lattice as a fixed, unmoving object. But this is not realistic: 
the underlying atoms can move. We would like to know what effect this has on the 
scattering off a lattice. 

Let’s return to our result (6.72) for the scattering amplitude off a Bravais lattice $\Lambda$,

$$f_\Lambda(\mathbf k,\mathbf k') = f(\mathbf k,\mathbf k')\sum_n e^{i\mathbf q\cdot\mathbf R_n}$$

where $f(\mathbf k,\mathbf k')$ is the amplitude for scattering from each site, $\mathbf q = \mathbf k - \mathbf k'$, and $\mathbf R_n \in \Lambda$.
Since the atoms can move, the positions $\mathbf R_n$ are no longer fixed. We should replace

$$\mathbf R_n \to \mathbf R_n + \mathbf u_n(t)$$


where $\mathbf u_n$ describes the deviation of the lattice from equilibrium. In general, this
deviation could arise from either thermal effects or quantum effects. In keeping with
the theme of these lectures, we will restrict to the latter. But this is conceptually
interesting: it means that the scattering amplitude includes the factor

$$\Delta(\mathbf q) = \sum_n e^{i\mathbf q\cdot\mathbf R_n}\,e^{i\mathbf q\cdot\mathbf u_n}$$


which is now a quantum operator. This is telling us something important. When a 
particle — whether photon or neutron — scatters off the lattice, it can now excite a 
phonon mode. The scattering amplitude is a quantum operator because it includes all 
possible end-states of the lattice. 


This opens up a whole slew of new physics. We could, for example, now start to 
compute inelastic scattering, in which the particle deposits some energy in the lattice. 
Here, however, we will content ourselves with elastic scattering, which means that
the lattice sits in its ground state $|0\rangle$ both before and after the scattering. For this, we
need to compute

$$\Delta(\mathbf q) = \sum_n e^{i\mathbf q\cdot\mathbf R_n}\,\langle 0|e^{i\mathbf q\cdot\mathbf u_n(t)}|0\rangle \qquad (6.75)$$


To proceed, we need to import some results from our discussion of phonons in the
Lectures on Solid State Physics. For simplicity, let’s consider a simple cubic lattice so
that the matrix element above factorises into terms in the $x$, $y$ and $z$ direction. For
each of these, we can use the formalism of the one-dimensional lattice, in which we write
the Fourier expansion of the displacement as

$$u_n(t) = X_0(t) + \sum_{l\neq 0}\sqrt{\frac{\hbar}{2mN\omega_l}}\left[a_l\,e^{-i(\omega_l t - k_l na)} + a_l^\dagger\,e^{i(\omega_l t - k_l na)}\right] \qquad (6.76)$$

where $\omega_l$ is the natural frequency at which the $l^{\rm th}$ mode, with wavenumber $k_l$, oscillates.




The normalisation $\sqrt{\hbar/2m\omega_l N}$ is for later convenience. Note the presence of $\hbar$: this
reflects the fact that the advertised convenience only becomes apparent in the quantum
theory. This means that we treat the displacement $u_n$ as a quantum operator. Correspondingly,
we must also treat $X_0$, $a_l$ and $a_l^\dagger$ as quantum operators. The normalisation
factor ensures that the usual position-momentum commutation relations for $u_n$ and $\dot u_n$
translate into simple commutation relations for $a_l$ and $a_l^\dagger$,

$$[a_l, a_{l'}^\dagger] = \delta_{l,l'} \quad\text{and}\quad [a_l, a_{l'}] = [a_l^\dagger, a_{l'}^\dagger] = 0$$

These are the familiar creation and annihilation operators of the harmonic oscillator.
The interpretation is that $a_l^\dagger$ creates a phonon of momentum $k_l$ and frequency $\omega_l = \omega(k_l)$.
More details of this can be found in the phonon section of the Lectures on Solid State
Physics.


Now we are in a position to compute (6.75). The matrix element $\langle 0|e^{i\mathbf q\cdot\mathbf u_n}|0\rangle$ is
independent of time and is also translationally invariant. This means that we can
evaluate it at $t = 0$ and at the lattice site $n = 0$. For a one-dimensional lattice with $N$
sites, the expansion (6.76) gives

$$u_0 = \sum_{k\neq 0}\sqrt{\frac{\hbar}{2mN\omega(k)}}\left(a(k) + a^\dagger(k)\right) \equiv A + A^\dagger$$


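The operators $A$ and $A^\dagger$ then obey

$$[A, A^\dagger] = \sum_{k\neq 0}\frac{\hbar}{2mN\omega(k)}$$

Our goal now is to compute $\langle 0|e^{iq(A+A^\dagger)}|0\rangle$. For this we use the BCH formula,

$$e^{iq(A+A^\dagger)} = e^{iqA^\dagger}\,e^{iqA}\,e^{\frac{1}{2}q^2[A^\dagger,A]}$$
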
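But the ground state of the lattice is defined to obey $a_l|0\rangle = 0$ for all $l$. This means
that $e^{iqA}|0\rangle = |0\rangle$ and, taking the conjugate, $\langle 0|e^{iqA^\dagger} = \langle 0|$. We end up with the result

$$\langle 0|e^{i\mathbf q\cdot\mathbf u_0}|0\rangle = e^{-W(q)} \quad\text{where}\quad W(q) = \sum_k \frac{\hbar q^2}{4mN\omega(k)}$$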


This is called the Debye-Waller factor. We see that the scattering amplitude becomes

$$f_\Lambda(\mathbf k,\mathbf k') = e^{-W(q)}\,f(\mathbf k,\mathbf k')\,\Delta(\mathbf q)$$

Note that, perhaps surprisingly, the atomic vibrations do not broaden the Bragg peaks
away from $\mathbf q \in \Lambda^*$. Instead, they only diminish their intensity.
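
Two pieces of this calculation lend themselves to a quick numerical check, sketched below under stated assumptions. First, for a single oscillator with $[b, b^\dagger] = 1$, the BCH argument gives $\langle 0|e^{i\lambda(b+b^\dagger)}|0\rangle = e^{-\lambda^2/2}$; the code verifies this by summing the power series of the exponential in Fock space. Second, $W(q)$ is evaluated for a one-dimensional chain, assuming the monatomic-chain dispersion $\omega(k) = 2\sqrt{\kappa/m}\,|\sin(ka/2)|$ with a spring constant $\kappa$ (called `spring` below). That dispersion, the function names and the unit choices are assumptions for illustration, not taken from this section.

```python
import math

def apply_x(state):
    """Apply x = b + b^dagger to Fock-space amplitudes [c_0, c_1, ...]."""
    new = [0.0] * (len(state) + 1)
    for n, c in enumerate(state):
        new[n + 1] += c * math.sqrt(n + 1)   # b^dagger |n> = sqrt(n+1) |n+1>
        if n > 0:
            new[n - 1] += c * math.sqrt(n)   # b |n> = sqrt(n) |n-1>
    return new

def vacuum_expectation(lam, terms=60):
    """<0| exp(i lam (b + b^dagger)) |0> from the power series of the exponential."""
    state = [1.0]        # the ground state |0>
    coeff = 1.0 + 0j     # (i lam)^n / n!
    total = 0j
    for n in range(terms):
        total += coeff * state[0]            # state[0] is <0| x^n |0>
        state = apply_x(state)
        coeff *= 1j * lam / (n + 1)
    return total

def debye_waller_W(q, N=100, m=1.0, spring=1.0, a=1.0, hbar=1.0):
    """W(q) = sum_{k != 0} hbar q^2 / (4 m N omega(k)) for an assumed 1D chain."""
    W = 0.0
    for l in range(-(N // 2) + 1, N // 2 + 1):
        if l == 0:
            continue                         # the k = 0 mode is excluded from the sum
        k = 2 * math.pi * l / (N * a)
        omega = 2 * math.sqrt(spring / m) * abs(math.sin(k * a / 2))
        W += hbar * q**2 / (4 * m * N * omega)
    return W

if __name__ == "__main__":
    print(vacuum_expectation(0.7), math.exp(-0.7**2 / 2))
    print(math.exp(-2 * debye_waller_W(1.0)))   # suppression of the peak intensity
```

Since the amplitude picks up $e^{-W}$, the intensity of a Bragg peak is reduced by $e^{-2W}$, and because $W \propto q^2$ the suppression is stronger for peaks at larger $\mathbf q$. In this one-dimensional toy model the sum also grows slowly with $N$, so the numbers should be read as illustrative only.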